Abstract

CLP(H) is an instantiation of the general constraint logic programming scheme with the
constraint domain of hedges. Hedges are finite sequences of unranked terms, built over variadic
function symbols and three kinds of variables: for terms, for hedges, and for function symbols.
Constraints involve equations between unranked terms and atoms for regular hedge language
membership. We study algebraic semantics of CLP(H) programs, define a sound, terminating,
and incomplete constraint solver, investigate two fragments of constraints for which the solver
returns a complete set of solutions, and describe classes of programs that generate such constraints.


To appear in Theory and Practice of Logic Programming (TPLP). 
KEYWORDS: Constraint logic programming, constraint solving, hedges.


1 Introduction 

Hedges are finite sequences of unranked terms. These are terms in which function symbols 
do not have a fixed arity: The same symbol may have a different number of arguments 
in different places. Manipulation of such expressions has been intensively studied in
recent years in the context of XML processing, rewriting, automated reasoning, and knowledge
representation, to name just a few.

When working with unranked terms, variables that can be instantiated with hedges
(hedge variables) are a pragmatic necessity. In (pattern-based) programming, hedge
variables help to write neat, compact code. Using them, for instance, one can extract
duplicates from a list with just one line of a program. Several languages and formalisms
operate on unranked terms and hedges. The programming language of Mathematica
(Wolfram 2003) is based on hedge pattern matching. Languages such as Tom (Balland
et al. 2007), Maude (Clavel et al. 2007), and ASF+SDF (van den Brand et al. 2001) provide
capabilities similar to hedge matching (via associative functions). ρLog (Marin and
Kutsia 2006) extends logic programming with hedge transformation rules, see also (Marin
and Kutsia 2003). XDuce (Hosoya and Pierce 2003) enriches untyped hedge matching
with regular expression types. The Constraint Logic Programming schema has been
extended to work with hedges in CLP(Flex) (Coelho and Florido 2004), which is a basis
for the XML processing language XCentric (Coelho and Florido 2007) and a Web site
verification language VeriFLog (Coelho and Florido 2006).

This is an extended version of a paper presented at the Twelfth International Symposium on
Functional and Logic Programming (FLOPS 2014), invited as a rapid publication in TPLP. The
authors acknowledge the assistance of the conference chairs Michael Codish and Eijiro Sumii.

The goal of this paper is to describe a precise semantics of constraint logic programs
over hedges. We consider positive CLP programs with two kinds of primitive constraints:
equations between hedges, and membership in a hedge regular language. Function
symbols are unranked. Predicate symbols have a fixed arity. Terms may contain three kinds
of variables: for terms (term variables), for hedges (hedge variables), and for function
symbols (function variables). Moreover, we may have function symbols whose argument
order does not matter (unordered symbols): a kind of generalization of the commutativity
property to unranked terms. As it turns out, such a language is very flexible and permits
writing short, yet clear and intuitive code; examples are given in Sect. 3. We
call this language CLP(H), for CLP over hedges. It generalizes CLP(Flex) with function
variables, unordered functions, and membership constraints. Hence, as a special case,
our paper describes the semantics of CLP(Flex). Moreover, as hedges generalize strings,
CLP(H) can also be seen as a generalization of CLP over strings CLP($\mathcal{S}$) (Rajasekar
1994), of the string processing features of Prolog III (Colmerauer 1990), and of CLP over
regular sets of strings CLP($\Sigma^*$) (Walinsky 1989).

Note that some of these languages allow an explicit size factor for string variables,
restricting the length of strings they can be instantiated with. We do not have size factors,
but can express this information easily with constraints. For instance, to indicate
that a hedge variable $\overline{x}$ can be instantiated with a hedge of minimal length 1 and
maximal length 3, we can write the disjunction
$\overline{x} \doteq x_1 \vee \overline{x} \doteq (x_1, x_2) \vee \overline{x} \doteq (x_1, x_2, x_3)$,
where the lower case $x$'s are term variables.

The flexibility and expressive power of CLP(H) have their price: Equational constraints
with hedge variables, in general, may have infinitely many solutions (Kutsia 2004; 2007).
Therefore, any complete equational constraint solving procedure with hedge variables is
nonterminating. The solver we describe in this paper is sound and terminating, hence
incomplete for arbitrary constraints. However, there are fragments of constraints for
which it is complete, i.e., computes all solutions. One such fragment is the so-called
well-moded fragment, where variables in one side of equations (or in the left hand side of the
membership atom) are guaranteed to be instantiated with ground expressions at some
point. This effectively reduces constraint solving to hedge matching (Kutsia and Marin
2005a; 2005b), plus some early failure detection rules. Another fragment for which the
solver is complete is named after the Knowledge Interchange Format, KIF (Genesereth
and Fikes 1992), where hedge variables are permitted only in the last argument positions.
We identify forms of CLP(H) programs which give rise to well-moded or KIF constraints.¹


¹ Conceptually, such an approach can be seen to be similar to, e.g., Miller's approach to higher-order
We can easily model lists with ordered function symbols and multisets with the help of
unordered ones. In fact, since we may have several such symbols, we can directly model
colored multisets. Constraint solving over lists, sets, and multisets has been intensively
studied, see, e.g., (Dovier et al. 2008) and references therein, and the CLP schema can be
extended to accommodate them. In our case, an advantage of using hedge variables in
such terms is that hedge variables give immediate access to collections of subterms
via unification, which is very handy in programming.

This paper is an extended and revised version of (Dundua et al. 2014). It is organized 
as follows: After establishing the terminology in Section 2, we give two motivating
examples in Section 3 to illustrate CLP(H). The algebraic semantics is studied in Section 4.
The constraint solver is introduced in Section 5. The operational semantics of CLP(H) 
is described in Section 6. In Sections 7 and 8, we introduce the well-moded and KIF 
fragments, respectively. Section 9 contains concluding remarks. 


2 Preliminaries 

For common notation and definitions, we mostly follow (Jaffar et al. 1998). The alphabet
$\mathcal{A}$ consists of the following pairwise disjoint sets of symbols:

• $\mathcal{V}_T$: term variables, denoted by $x, y, z, \ldots$,

• $\mathcal{V}_H$: hedge variables, denoted by $\overline{x}, \overline{y}, \overline{z}, \ldots$,

• $\mathcal{V}_F$: function variables, denoted by $X, Y, Z, \ldots$,

• $\mathcal{F}_u$: unranked unordered function symbols, denoted by $f_u, g_u, h_u, \ldots$,

• $\mathcal{F}_o$: unranked ordered function symbols, denoted by $f_o, g_o, h_o, \ldots$,

• $\mathcal{P}$: ranked predicate symbols, denoted by $p, q, \ldots$.

The sets of variables are countable, while the sets of function and predicate symbols are
finite. In addition, $\mathcal{A}$ also contains

• The propositional constants true and false, the binary equality predicate $\doteq$, and
the unranked membership predicate in.

• Regular operators: eps, $\cdot$, $+$, $*$.

• Logical connectives and quantifiers: $\neg$, $\vee$, $\wedge$, $\rightarrow$, $\leftrightarrow$, $\exists$, $\forall$.

• Auxiliary symbols: parentheses and the comma.

Function symbols, denoted by $f, g, h, \ldots$, are elements of the set
$\mathcal{F} = \mathcal{F}_u \cup \mathcal{F}_o$. A variable is an element of the set
$\mathcal{V} = \mathcal{V}_T \cup \mathcal{V}_H \cup \mathcal{V}_F$. A functor, denoted by $F$, is a
common name for a function symbol or a function variable.

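We define terms, hedges, and other syntactic categories over $\mathcal{A}$ as follows:

$$\begin{array}{llll}
t & ::= & x \mid f(H) \mid X(H) & \text{Term} \\
T & ::= & t_1, \ldots, t_n \quad (n \geq 0) & \text{Term sequence} \\
h & ::= & t \mid \overline{x} & \text{Hedge element} \\
H & ::= & h_1, \ldots, h_n \quad (n \geq 0) & \text{Hedge}
\end{array}$$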


logic programming (Miller 1991), where the fragment $L_\lambda$ uses unitary unification for
higher-order patterns instead of undecidable higher-order unification.
We denote the set of terms by $\mathcal{T}(\mathcal{F}, \mathcal{V})$ and the set of ground (i.e.,
variable-free) terms by $\mathcal{T}(\mathcal{F})$. Besides the letter $t$, we also use $r$ and $s$
to denote terms.

We adopt a couple of conventions to improve readability. The empty hedge is written
as $\epsilon$. The terms of the form $a(\epsilon)$ and $X(\epsilon)$ are abbreviated as $a$ and $X$,
respectively. We put parentheses around hedges, writing, e.g., $(f(a), x, b)$ instead of
$f(a), x, b$. For hedges $H = (h_1, \ldots, h_n)$ and $H' = (h'_1, \ldots, h'_{n'})$, the notation
$(H, H')$ stands for the hedge

$$(h_1, \ldots, h_n, h'_1, \ldots, h'_{n'}).$$

Two hedges are disjoint if they do not share a common element. For instance, $(f(a), x, b)$
and $(f(x), f(b, f(a)))$ are disjoint, whereas $(f(a), x, b)$ and $(f(b), f(a))$ are not, because
$f(a)$ is their common element.

An atom is a formula of the form $p(t_1, \ldots, t_n)$, where $p \in \mathcal{P}$ is an $n$-ary
predicate symbol. Atoms are denoted by $A$.

Regular hedge expressions $R$ are defined inductively:

$$R ::= \mathsf{eps} \mid (R \cdot R) \mid R + R \mid R^* \mid f(R)$$

where the dot $\cdot$ stands for concatenation, $+$ for choice, and $*$ for repetition. Primitive
constraints are either term equalities $\doteq(t_1, t_2)$ or membership for hedges
$\mathsf{in}(H, R)$. They are written in infix notation, such as $t_1 \doteq t_2$ and
$H \;\mathsf{in}\; R$.

A literal $L$ is an atom or a primitive constraint. Formulas are defined as usual. A
constraint is an arbitrary first-order formula built over true, false, and primitive constraints.

The set of free variables of a syntactic object $O$ is denoted by $var(O)$. We let $\exists_V N$
denote the formula $\exists v_1 \cdots \exists v_n N$, where $V = \{v_1, \ldots, v_n\} \subset \mathcal{V}$.
$\overline{\exists}_V N$ denotes $\exists_{var(N) \setminus V} N$.
We write $\exists N$ (resp. $\forall N$) for the existential (resp. universal) closure of $N$. We
refer to a language over the alphabet $\mathcal{A}$ as $\mathcal{L}(\mathcal{A})$.

A substitution is a mapping from term variables to terms, from hedge variables to
hedges, and from function variables to functors, such that all but finitely many variables
are mapped to themselves. We use lower case Greek letters to denote substitutions.

For an expression (i.e., a term, hedge, functor, literal, or a formula) $e$ and a substitution
$\sigma$, we write $e\sigma$ for the instance of $e$ under $\sigma$. This is a standard operation that
replaces in $e$ each free occurrence of a variable $v$ by its image under $\sigma$, i.e., by
$\sigma(v)$. If needed, bound variables are renamed to avoid variable capture. For instance, for
the constraint
$C = \forall x.\, f(X(a, \overline{x}), \overline{x}) \doteq f(g(\overline{y}, a, b, x), b, x)$
and the substitution
$\sigma = \{X \mapsto g,\ \overline{x} \mapsto (b, x),\ \overline{y} \mapsto \epsilon,\ x \mapsto f(c)\}$,
we have $C\sigma = \forall z.\, f(g(a, b, x), b, x) \doteq f(g(a, b, z), b, z)$. A substitution
$\sigma$ is grounding for an expression $e$ if $e\sigma$ is a ground expression.
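The effect of a substitution on quantifier-free hedges can be sketched in Python as follows. This is only an illustration under our own assumptions: the class names `TVar`, `HVar`, `FVar`, `Term` and the function `apply_subst` are hypothetical and not part of CLP(H), and the sketch ignores quantifiers, so no renaming of bound variables is performed.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class TVar:          # term variable, e.g. x
    name: str


@dataclass(frozen=True)
class HVar:          # hedge variable, e.g. x with an overline
    name: str


@dataclass(frozen=True)
class FVar:          # function variable, e.g. X
    name: str


@dataclass(frozen=True)
class Term:          # F(H): a functor (symbol or function variable) applied to a hedge
    head: Union[str, FVar]
    args: Tuple = ()


def apply_subst(hedge, sigma):
    """Apply sigma to a hedge, represented as a tuple of hedge elements."""
    out = []
    for e in hedge:
        if isinstance(e, HVar):
            # A hedge variable is replaced by a whole hedge, spliced in place.
            out.extend(sigma.get(e, (e,)))
        elif isinstance(e, TVar):
            out.append(sigma.get(e, e))
        else:
            head = sigma.get(e.head, e.head) if isinstance(e.head, FVar) else e.head
            out.append(Term(head, apply_subst(e.args, sigma)))
    return tuple(out)
```

Note that instantiating a hedge variable may change the number of arguments of the enclosing term, which is exactly what distinguishes hedge variables from term variables.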

A (constraint logic) program is a finite set of rules of the form
$\forall (L_1 \wedge \cdots \wedge L_n \rightarrow A)$, $n \geq 0$, usually written as
$A \leftarrow L_1, \ldots, L_n$, where $A$ is an atom and $L_1, \ldots, L_n$ are literals
other than true and false. A goal is a formula of the form
$\exists (L_1 \wedge \cdots \wedge L_n)$, $n \geq 0$, usually written as $L_1, \ldots, L_n$,
where $L_1, \ldots, L_n$ are literals other than true and false.

We say a variable is solved in a conjunction of primitive constraints
$\mathcal{K} = c_1 \wedge \cdots \wedge c_n$, if there is a $c_i$, $1 \leq i \leq n$, such that

• the variable is $x$, $c_i = x \doteq t$, and $x$ occurs neither in $t$ nor elsewhere in $\mathcal{K}$, or

• the variable is $\overline{x}$, $c_i = \overline{x} \doteq H$, and $\overline{x}$ occurs neither in $H$ nor elsewhere in $\mathcal{K}$, or

• the variable is $X$, $c_i = X \doteq F$, and $X$ occurs neither in $F$ nor elsewhere in $\mathcal{K}$, or

• the variable is $x$, $c_i = x \;\mathsf{in}\; f(R)$, and $x$ does not occur in membership constraints
elsewhere in $\mathcal{K}$, or

• the variable is $\overline{x}$, $c_i = \overline{x} \;\mathsf{in}\; R$, $\overline{x}$ does not occur in
membership constraints elsewhere in $\mathcal{K}$, and $R$ has the form $R_1 \cdot R_2$ or $R_1^*$.

In this case we also say that $c_i$ is solved in $\mathcal{K}$. Moreover, $\mathcal{K}$ is called solved
if for any $1 \leq i \leq n$, $c_i$ is solved in it. $\mathcal{K}$ is partially solved, if for any
$1 \leq i \leq n$, $c_i$ is solved in $\mathcal{K}$, or has one of the following forms:

• Membership atom:

  - $f_u(H_1, \overline{x}, H_2) \;\mathsf{in}\; f_u(R)$.

  - $(\overline{x}, H) \;\mathsf{in}\; R$ where $H \neq \epsilon$ and $R$ has the form $R_1 \cdot R_2$ or $R_1^*$.

• Equation:

  - $(\overline{x}, H_1) \doteq (\overline{y}, H_2)$ where $\overline{x} \neq \overline{y}$, $H_1 \neq \epsilon$ and $H_2 \neq \epsilon$.

  - $(\overline{x}, H_1) \doteq (T, \overline{y}, H_2)$, where $\overline{x} \notin var(T)$, $H_1 \neq \epsilon$, and
    $T \neq \epsilon$. The variables $\overline{x}$ and $\overline{y}$ are not necessarily distinct.

  - $f_u(H_1, \overline{x}, H_2) \doteq f_u(H_3, \overline{y}, H_4)$ where $(H_1, \overline{x}, H_2)$ and
    $(H_3, \overline{y}, H_4)$ are disjoint.

A constraint is solved, if it is either true or a non-empty quantifier-free disjunction of
solved conjunctions. A constraint is partially solved, if it is either true or a non-empty
quantifier-free disjunction of partially solved conjunctions.

3 Motivating Examples 

In this section we illustrate the expressive power of CLP(H) by two examples: the
rewriting of terms from some regular hedge language and an implementation of the recursive
path ordering with status.

Example 1 

The general rewriting mechanism can be implemented with two CLP(H) clauses: the
base case

$$\mathit{rewrite}(x, y) \leftarrow \mathit{rule}(x, y)$$

and the recursive case

$$\mathit{rewrite}(X(\overline{x}, x, \overline{y}), X(\overline{x}, y, \overline{y})) \leftarrow \mathit{rewrite}(x, y),$$

where $x, y$ are term variables, $\overline{x}, \overline{y}$ are hedge variables, and $X$ is a function
variable. It is assumed that there are clauses which define the rule predicate. The base case says
that a term $x$ can be rewritten to $y$ if there is a rule which does it. The recursive case
rewrites a nondeterministically selected subterm $x$ of the input term to $y$, leaving the
context around it unchanged. Applying the base case before the recursive case gives the
outermost strategy of rewriting, while the other way around implements the innermost
one.

An example of the definition of the rule predicate is

$$\mathit{rule}(X(\overline{x}_1, \overline{x}_2), X(\overline{y})) \leftarrow \overline{x}_1 \;\mathsf{in}\; f(a^*) \cdot b^*,\ \overline{x}_1 \doteq (x, \overline{z}),\ \overline{y} \doteq (x, f(\overline{z})),$$

where the constraint² $\overline{x}_1 \;\mathsf{in}\; f(a^*) \cdot b^*$ requires $\overline{x}_1$ to be
instantiated by hedges from the language generated by the regular hedge expression
$f(a^*) \cdot b^*$ (that is, from the language
$\{f, f(a), f(a, a), \ldots, (f, b), (f(a), b), \ldots, (f(a, a), b, \ldots, b), \ldots\}$).

² In the notation defined in the previous section, strictly speaking, we need to write this constraint as
$\overline{x}_1 \;\mathsf{in}\; f(a(\mathsf{eps})^*) \cdot b(\mathsf{eps})^*$. However, for brevity and
clarity of the presentation we omit eps here.

With this program, the goal $\leftarrow \mathit{rewrite}(f(f(f(a, a), b)), x)$ has two answer
substitutions: $\{x \mapsto f(f(f(a, a), f))\}$ and $\{x \mapsto f(f(f(a, a), f(b)))\}$. To obtain
them, the goal is first transformed by the recursive clause, leading to the new goal
$\leftarrow \mathit{rewrite}(f(f(a, a), b), y)$ together with the constraint $x \doteq f(y)$ for $x$.
The next transformation is performed by the base case of the rewrite predicate, resulting
in the goal $\leftarrow \mathit{rule}(f(f(a, a), b), y)$. This goal is then transformed by the rule
clause, which gives the constraint
$X(\overline{x}_1, \overline{x}_2) \doteq f(f(a, a), b) \wedge y \doteq X(\overline{y}) \wedge \overline{x}_1 \;\mathsf{in}\; f(a^*) \cdot b^* \wedge \overline{x}_1 \doteq (x', \overline{z}) \wedge \overline{y} \doteq (x', f(\overline{z})) \wedge x \doteq f(y)$.
This constraint has two solutions, depending on whether $\overline{x}_1$ equals $f(a, a)$ or
$(f(a, a), b)$. From one we get $x \doteq f(f(f(a, a), f))$, and from the other
$x \doteq f(f(f(a, a), f(b)))$. These solutions give the above mentioned answers.
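The two answers can be reproduced by a small Python simulation on ground terms. This is a sketch under our own assumptions, not the CLP(H) machinery: terms are encoded as pairs `(symbol, args)`, the helper names `t`, `in_lang`, `rule`, and `rewrite` are hypothetical, and membership in $f(a^*) \cdot b^*$ is hard-coded rather than derived from a regular hedge expression.

```python
def t(sym, *args):
    """Build a ground term as a pair (symbol, tuple of argument terms)."""
    return (sym, tuple(args))


def in_lang(hedge):
    """Membership of a ground hedge in f(a*) . b* (hard-coded for this example)."""
    if not hedge:
        return False
    first, rest = hedge[0], hedge[1:]
    first_ok = first[0] == "f" and all(c == t("a") for c in first[1])
    return first_ok and all(c == t("b") for c in rest)


def rule(term):
    """Split the arguments into (x1, x2) with x1 in the language; x1 = (x, z), result X(x, f(z))."""
    sym, args = term
    for k in range(len(args) + 1):        # every split of the arguments into (x1, x2)
        x1 = args[:k]
        if in_lang(x1):
            x, z = x1[0], x1[1:]
            yield (sym, (x, ("f", z)))


def rewrite(term):
    """Base case first, then the recursive case on each argument position."""
    yield from rule(term)
    sym, args = term
    for i, sub in enumerate(args):
        for new in rewrite(sub):
            yield (sym, args[:i] + (new,) + args[i + 1:])


goal = t("f", t("f", t("f", t("a"), t("a")), t("b")))
answers = list(rewrite(goal))
```

Enumerating the splits of the argument hedge plays the role of solving the equation $X(\overline{x}_1, \overline{x}_2) \doteq f(f(a, a), b)$ for a ground right-hand side.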

Example 2 

The recursive path ordering (rpo) $>_{rpo}$ is a well-known term ordering (Dershowitz 1982)
used to prove termination of rewriting systems. Its definition is based on a precedence
order $>$ on function symbols, and on extensions of $>_{rpo}$ from terms to tuples of terms.
There are two kinds of extensions: lexicographic $>_{rpo}^{lex}$, when terms in tuples are compared
from left to right, and multiset $>_{rpo}^{mul}$, when terms in tuples are compared disregarding
the order. The status function $\tau$ assigns to each function symbol either lex or mul status.
Then for all (ranked) terms $s, t$, we define $s >_{rpo} t$, if $s = f(s_1, \ldots, s_m)$ and

1. either $s_i = t$ or $s_i >_{rpo} t$ for some $s_i$, $1 \leq i \leq m$, or

2. $t = g(t_1, \ldots, t_n)$, $s >_{rpo} t_i$ for all $i$, $1 \leq i \leq n$, and either

(a) $f > g$, or (b) $f = g$ and $(s_1, \ldots, s_n) >_{rpo}^{\tau(f)} (t_1, \ldots, t_n)$.
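For comparison with the CLP(H) program, the definition above can be sketched directly in Python on ground ranked terms. The encoding is our own assumption: terms are pairs `(symbol, args)`, `prec` is a set of pairs `(f, g)` meaning $f > g$ (assumed to be given transitively closed), `status` maps each symbol to `"lex"` or `"mul"`, and the multiset extension is computed in the usual way via multiset differences.

```python
from collections import Counter


def rpo(s, t, prec, status):
    """Decide s >_rpo t, following cases 1, 2(a), and 2(b) of the definition."""
    f, ss = s
    g, ts = t

    def gt(u, v):
        return rpo(u, v, prec, status)

    # Case 1: some argument of s equals t or is greater than t.
    if any(si == t or gt(si, t) for si in ss):
        return True
    # Case 2 requires s >_rpo t_i for every argument t_i of t.
    if not all(gt(s, ti) for ti in ts):
        return False
    if (f, g) in prec:                      # case 2(a)
        return True
    if f != g:
        return False
    if status[f] == "lex":                  # case 2(b), lexicographic extension
        for x, y in zip(ss, ts):
            if x != y:
                return gt(x, y)
        return False
    # Case 2(b), multiset extension: compare what is left after removing
    # the common part of the two multisets of arguments.
    m, n = Counter(ss), Counter(ts)
    m_only, n_only = m - n, n - m
    return bool(m_only) and all(any(gt(x, y) for x in m_only) for y in n_only)
```

The sketch works on ranked ground terms only; the CLP(H) version obtains the tuple and multiset views of the arguments through hedge variables and an unordered function symbol instead of explicit loops and counters.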

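To implement this definition in CLP(H), we use the predicate rpo for $>_{rpo}$ between
two terms, and four helper predicates: rpo_all to implement the comparison $s >_{rpo} t_i$ for
all $i$; prec to implement the comparison depending on the precedence; ext to implement
the comparison with respect to an extension of $>_{rpo}$; and status to give the status of
a function symbol. The predicate lex implements $>_{rpo}^{lex}$ and mul implements
$>_{rpo}^{mul}$. The symbol $\langle\rangle$ is an unranked function symbol, and $\{\}$ is an
unordered unranked function symbol. As one can see, the implementation is rather
straightforward and closely follows the definition. $>_{rpo}$ requires four clauses, since there
are four alternatives in the definition:

1. $\mathit{rpo}(X(\overline{x}, x, \overline{y}), x).$

   $\mathit{rpo}(X(\overline{x}, x, \overline{y}), y) \leftarrow \mathit{rpo}(x, y).$

2a. $\mathit{rpo}(X(\overline{x}), Y(\overline{y})) \leftarrow \mathit{rpo\_all}(X(\overline{x}), \langle\overline{y}\rangle),\ \mathit{prec}(X, Y).$

2b. $\mathit{rpo}(X(\overline{x}), X(\overline{y})) \leftarrow \mathit{rpo\_all}(X(\overline{x}), \langle\overline{y}\rangle),\ \mathit{ext}(X(\overline{x}), X(\overline{y})).$

rpo_all is implemented with recursion:

$\mathit{rpo\_all}(x, \langle\rangle).$

$\mathit{rpo\_all}(x, \langle y, \overline{y}\rangle) \leftarrow \mathit{rpo}(x, y),\ \mathit{rpo\_all}(x, \langle\overline{y}\rangle).$

The definition of prec as an ordering on finitely many function symbols is straightforward.
More interesting is the definition of ext:

$\mathit{ext}(X(\overline{x}), X(\overline{y})) \leftarrow \mathit{status}(X, lex),\ \mathit{lex}(\langle\overline{x}\rangle, \langle\overline{y}\rangle).$

$\mathit{ext}(X(\overline{x}), X(\overline{y})) \leftarrow \mathit{status}(X, mul),\ \mathit{mul}(\{\overline{x}\}, \{\overline{y}\}).$

status can be given as a set of facts, lex needs one clause, and mul requires three:

$\mathit{lex}(\langle\overline{x}, x, \overline{y}\rangle, \langle\overline{x}, y, \overline{z}\rangle) \leftarrow \mathit{rpo}(x, y).$

$\mathit{mul}(\{x, \overline{x}\}, \{\}).$

$\mathit{mul}(\{x, \overline{x}\}, \{x, \overline{y}\}) \leftarrow \mathit{mul}(\{\overline{x}\}, \{\overline{y}\}).$

$\mathit{mul}(\{x, \overline{x}\}, \{y, \overline{y}\}) \leftarrow \mathit{rpo}(x, y),\ \mathit{mul}(\{x, \overline{x}\}, \{\overline{y}\}).$

That is all. This example illustrates the benefits of all three kinds of variables
and of unordered function symbols.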


4 Algebraic Semantics 

For a given set $S$, we denote by $S^*$ the set of finite, possibly empty, sequences of elements
of $S$, and by $S^n$ the set of sequences of length $n$ of elements of $S$. The empty sequence
of symbols from any set $S$ is denoted by $\epsilon$. Given a sequence
$s = (s_1, s_2, \ldots, s_n) \in S^n$, we denote by $perm(s)$ the set of sequences
$\{(s_{\pi(1)}, s_{\pi(2)}, \ldots, s_{\pi(n)}) \mid \pi \text{ is a permutation of } \{1, 2, \ldots, n\}\}$.

A structure $\mathfrak{S}$ for a language $\mathcal{L}(\mathcal{A})$ is a tuple $\langle D, I \rangle$ made of
a non-empty carrier set $D$ of individuals and an interpretation function $I$ that maps each
function symbol $f \in \mathcal{F}$ to a function $I(f) : D^* \to D$, and each $n$-ary predicate
symbol $p \in \mathcal{P}$ to an $n$-ary relation $I(p) \subseteq D^n$. Moreover, if
$f \in \mathcal{F}_u$ then $I(f)(s) = I(f)(s')$ for all $s \in D^*$ and $s' \in perm(s)$.
A variable assignment for such a structure is a function with domain $\mathcal{V}$ that maps term
variables to elements of $D$, hedge variables to elements of $D^*$, and function variables to
functions from $D^*$ to $D$.

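The interpretations of our syntactic categories w.r.t. a structure
$\mathfrak{S} = \langle D, I \rangle$ and variable assignment $\sigma$ are shown below. The
interpretation $[\![H]\!]_{\mathfrak{S},\sigma}$ of hedges (including terms) is defined as follows:

$$\begin{array}{lll}
[\![v]\!]_{\mathfrak{S},\sigma} & := & \sigma(v), \text{ where } v \in \mathcal{V}_T \cup \mathcal{V}_H. \\
[\![f(H)]\!]_{\mathfrak{S},\sigma} & := & I(f)([\![H]\!]_{\mathfrak{S},\sigma}). \\
[\![X(H)]\!]_{\mathfrak{S},\sigma} & := & \sigma(X)([\![H]\!]_{\mathfrak{S},\sigma}). \\
[\![(h_1, \ldots, h_n)]\!]_{\mathfrak{S},\sigma} & := & ([\![h_1]\!]_{\mathfrak{S},\sigma}, \ldots, [\![h_n]\!]_{\mathfrak{S},\sigma}).
\end{array}$$

Note that terms are interpreted as elements of $D$ and hedges as elements of $D^*$. We
may omit $\sigma$ and write simply $[\![E]\!]_{\mathfrak{S}}$ for the interpretation of a ground
expression $E$. The interpretation of regular expressions is defined as follows:

$$\begin{array}{lll}
[\![\mathsf{eps}]\!]_{\mathfrak{S}} & := & \{\epsilon\}. \\
[\![f(R)]\!]_{\mathfrak{S}} & := & \{I(f)(H) \mid H \in [\![R]\!]_{\mathfrak{S}}\}. \\
[\![R_1 + R_2]\!]_{\mathfrak{S}} & := & [\![R_1]\!]_{\mathfrak{S}} \cup [\![R_2]\!]_{\mathfrak{S}}. \\
[\![R_1 \cdot R_2]\!]_{\mathfrak{S}} & := & \{(H_1, H_2) \mid H_1 \in [\![R_1]\!]_{\mathfrak{S}},\ H_2 \in [\![R_2]\!]_{\mathfrak{S}}\}. \\
[\![R^*]\!]_{\mathfrak{S}} & := & [\![R]\!]_{\mathfrak{S}}^*.
\end{array}$$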

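Primitive constraints are interpreted with respect to a structure $\mathfrak{S}$ and variable
assignment $\sigma$ as follows:

$$\begin{array}{lll}
\mathfrak{S} \models_\sigma t_1 \doteq t_2 & \text{iff} & [\![t_1]\!]_{\mathfrak{S},\sigma} = [\![t_2]\!]_{\mathfrak{S},\sigma}. \\
\mathfrak{S} \models_\sigma H \;\mathsf{in}\; R & \text{iff} & [\![H]\!]_{\mathfrak{S},\sigma} \in [\![R]\!]_{\mathfrak{S}}. \\
\mathfrak{S} \models_\sigma p(t_1, \ldots, t_n) & \text{iff} & I(p)([\![t_1]\!]_{\mathfrak{S},\sigma}, \ldots, [\![t_n]\!]_{\mathfrak{S},\sigma}) \text{ holds}.
\end{array}$$

The notions $\mathfrak{S} \models N$ for validity of an arbitrary formula $N$ in $\mathfrak{S}$, and
$\models N$ for validity of $N$ in any structure are defined in the standard way.

An intended structure is a structure $\mathfrak{I}$ with the carrier set $\mathcal{T}(\mathcal{F})$ and
interpretations $I$ defined for every $f \in \mathcal{F}$ by $I(f)(H) := f(H)$. Thus, intended
structures identify terms and hedges by themselves. Also, if $R$ is any regular hedge expression
then $[\![R]\!]_{\mathfrak{I}}$ is the same in all intended structures, and will be denoted by
$[\![R]\!]$. Other remarkable properties of intended structures $\mathfrak{I}$ are: Variable
assignments are substitutions, $\mathfrak{I} \models_\vartheta t_1 \doteq t_2$ iff
$t_1\vartheta = t_2\vartheta$, and $\mathfrak{I} \models_\vartheta H \;\mathsf{in}\; R$ iff
$H\vartheta \in [\![R]\!]$.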

Given a program $P$, its Herbrand base $B_P$ is, naturally, the set of all atoms
$p(t_1, \ldots, t_n)$, where $p$ is an $n$-ary user-defined predicate in $P$ and
$(t_1, \ldots, t_n) \in \mathcal{T}(\mathcal{F})^n$. Then an intended interpretation of $P$
corresponds uniquely to a subset of $B_P$. An intended model of $P$ is an intended
interpretation of $P$ that is its model.

As usual, we will write $P \models G$ if $G$ is a goal which holds in every model of $P$. Since
our programs consist of positive clauses, the following facts hold:

1. Every program $P$ has a least intended model, which we denote by $lm(P)$.

2. If $G$ is a goal then $P \models G$ iff $lm(P)$ is a model of $G$.

A ground substitution $\vartheta$ is an intended solution (or simply solution) of a constraint $C$
if $\mathfrak{I} \models C\vartheta$ for all intended structures $\mathfrak{I}$.

Theorem 1

If the constraint $C$ is solved, then $\mathfrak{I} \models \exists C$ holds for all intended
structures $\mathfrak{I}$.


5 Solver 

In this section we present a constraint solver for quantifier-free constraints in disjunctive
normal form (DNF). It is based on rules that transform a constraint in DNF into a
constraint in DNF. We say a constraint is in DNF, if it has a form
$\mathcal{K}_1 \vee \cdots \vee \mathcal{K}_n$, where the $\mathcal{K}$'s are conjunctions of true,
false, and primitive constraints. The number of rules is not small (as is usual for this kind of
solver, cf., e.g., (Dovier et al. 2000; Comon 1998)). To make them easier to comprehend, we
group them so that similar ones are collected together in subsections. Within each subsection,
for better readability, the rule groups are put between horizontal lines.

Before going into the details, we introduce a more conventional way of writing
expressions, a kind of syntactic sugar, that should make reading easier. Instead of
$F_1() \doteq F_2()$ and $f_o(H_1) \doteq f_o(H_2)$ we write $F_1 \doteq F_2$ and
$H_1 \doteq H_2$ respectively. The symmetric closure of the relation $\doteq$ is denoted by
$\simeq$. The rules are applied in any context, i.e., they behave as rewrite rules. Moreover,
when a rule applies to a conjunction of the form $L \wedge \mathcal{K}$, it is intended to act on an
entire conjunct of the DNF, modulo associativity and commutativity of $\wedge$. These
assumptions guarantee that the constraint obtained after each rule application is again in DNF.
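Logical Rules.

There are eight logical rules which are applied at any depth in constraints, modulo
associativity and commutativity of disjunction and conjunction. $N$ stands for any formula.
We denote the whole set of rules by Log.

$$\begin{array}{ll}
N \wedge N \leadsto N & N \vee N \leadsto N \\
\mathsf{false} \wedge N \leadsto \mathsf{false} & \mathsf{false} \vee N \leadsto N \\
\mathsf{true} \wedge N \leadsto N & \mathsf{true} \vee N \leadsto \mathsf{true} \\
H \doteq H \leadsto \mathsf{true} & \epsilon \;\mathsf{in}\; R \leadsto \mathsf{true}, \text{ if } \epsilon \in [\![R]\!]
\end{array}$$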

Failure Rules. 

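The first two rules perform occurrence check, rules (F3) and (F5) detect function symbol
clash, and rules (F4), (F6), (F7) detect inconsistent primitive constraints. We denote the
set of rules (F1)-(F7) by Fail.


(F1) $x \simeq (H_1, F(H), H_2) \leadsto \mathsf{false}$, if $x \in var(H)$.

(F2) $\overline{x} \simeq (H_1, t, H_2) \leadsto \mathsf{false}$, if $\overline{x} \in var(H_1, t, H_2)$.

(F3) $f_1(H_1) \simeq f_2(H_2) \leadsto \mathsf{false}$, if $f_1 \neq f_2$.

(F4) $\epsilon \simeq (H_1, t, H_2) \leadsto \mathsf{false}$.

(F5) $f_1(H) \;\mathsf{in}\; f_2(R) \leadsto \mathsf{false}$, if $f_1 \neq f_2$.

(F6) $\epsilon \;\mathsf{in}\; R \leadsto \mathsf{false}$, if $\epsilon \notin [\![R]\!]$.

(F7) $(H_1, t, H_2) \;\mathsf{in}\; \mathsf{eps} \leadsto \mathsf{false}$.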


Decomposition Rules. 

The set of these rules is denoted by Dec. They operate on a conjunction of literals and
give back either a conjunction of literals again, or a constraint in DNF.


(D1) $f_u(H) \simeq f_u(T) \wedge \mathcal{K} \leadsto \bigvee_{T' \in perm(T)} (H \doteq T' \wedge \mathcal{K})$,
where $H$ and $T$ are disjoint.

(D2) $(t_1, H_1) \simeq (t_2, H_2) \leadsto t_1 \doteq t_2 \wedge H_1 \doteq H_2$, where
$H_1 \neq \epsilon$ or $H_2 \neq \epsilon$.
Deletion Rules.

These rules delete identical terms or hedge variables from both sides of an equation. We
denote this set of rules by Del.


(Del1) $(\overline{x}, H_1) \simeq (\overline{x}, H_2) \leadsto H_1 \doteq H_2$.

(Del2) ^ 

(Del3) $\overline{x} \simeq (H_1, \overline{x}, H_2) \leadsto H_1 \doteq \epsilon \wedge H_2 \doteq \epsilon$, if $H_1 \neq \epsilon$.


Variable Elimination Rules. 

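These rules eliminate variables from the given constraint, keeping only a solved
equation for them. They apply to disjuncts. The first two rules replace a variable with the
corresponding expression, provided that the occurrence check fails, i.e., the variable does
not occur in that expression:

(E1) $x \simeq t \wedge \mathcal{K} \leadsto x \doteq t \wedge \mathcal{K}\vartheta$,

where $x \notin var(t)$, $x \in var(\mathcal{K})$ and $\vartheta = \{x \mapsto t\}$. If $t$ is a
variable then in addition it is required that $t \in var(\mathcal{K})$.

(E2) $\overline{x} \simeq H \wedge \mathcal{K} \leadsto \overline{x} \doteq H \wedge \mathcal{K}\vartheta$,

where $\overline{x} \notin var(H)$, $\overline{x} \in var(\mathcal{K})$, and
$\vartheta = \{\overline{x} \mapsto H\}$. If $H = \overline{y}$ for some $\overline{y}$, then in
addition it is required that $\overline{y} \in var(\mathcal{K})$.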

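The next two rules (E3) and (E4) assign to a variable an initial part of the hedge in
the other side of the selected equation. The hedge has to be a sequence of terms $T$ in
the first rule. The disjunction in the rule is over all possible splits of $T$. In the second
rule, only a split of the prefix $T$ of the hedge is relevant and the disjunction is over all
such possible splits of $T$. The rest is blocked by the term $t$ due to occurrence check: No
instantiation of $\overline{x}$ can contain it.

(E3) $(\overline{x}, H) \simeq T \wedge \mathcal{K} \leadsto \bigvee_{T = (T_1, T_2)} (\overline{x} \doteq T_1 \wedge H\vartheta \doteq T_2 \wedge \mathcal{K}\vartheta)$,

where $\overline{x} \notin var(T)$, $\vartheta = \{\overline{x} \mapsto T_1\}$, and $H \neq \epsilon$.

(E4) $(\overline{x}, H_1) \simeq (T, t, H_2) \wedge \mathcal{K} \leadsto \bigvee_{T = (T_1, T_2)} (\overline{x} \doteq T_1 \wedge H_1\vartheta \doteq (T_2, t, H_2)\vartheta \wedge \mathcal{K}\vartheta)$,

where $\overline{x} \notin var(T)$, $\overline{x} \in var(t)$, $\vartheta = \{\overline{x} \mapsto T_1\}$, and $H_1 \neq \epsilon$.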
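The finite branching of (E3) can be illustrated by enumerating the splits of $T$. The following Python sketch rests on our own simplifying assumptions: hedges are flat tuples of strings, the hedge variable occurs in $H$ only at the top level, the remaining conjunction $\mathcal{K}$ is omitted, and the names `splits` and `e3_disjuncts` are hypothetical.

```python
def splits(T):
    """All ways to write the sequence T as (T1, T2), in order of growing T1."""
    return [(T[:k], T[k:]) for k in range(len(T) + 1)]


def e3_disjuncts(xbar, H, T):
    """One disjunct per split of T: a binding for xbar and the residual equation."""
    out = []
    for T1, T2 in splits(T):
        # Apply the binding xbar -> T1 to H (top-level occurrences only).
        H_theta = tuple(e for h in H for e in (T1 if h == xbar else (h,)))
        out.append({"binding": (xbar, T1), "equation": (H_theta, T2)})
    return out


# The equation (xs, ys) = (a, b) yields three disjuncts.
branches = e3_disjuncts("xs", ("ys",), ("a", "b"))
```

A sequence of $n$ terms has exactly $n + 1$ splits, so the disjunction produced by (E3) is always finite.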


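Finally, there are three rules for function variable elimination. Their behavior is
standard:


(E5) $X \simeq F \wedge \mathcal{K} \leadsto X \doteq F \wedge \mathcal{K}\vartheta$,

where $X \neq F$, $X \in var(\mathcal{K})$, and $\vartheta = \{X \mapsto F\}$. If $F$ is a function
variable, then in addition it is required that $F \in var(\mathcal{K})$.

(E6) $X(H_1) \simeq F(H_2) \wedge \mathcal{K} \leadsto X \doteq F \wedge F(H_1)\vartheta \doteq F(H_2)\vartheta \wedge \mathcal{K}\vartheta$,

where $X \neq F$, $\vartheta = \{X \mapsto F\}$, and $H_1 \neq \epsilon$ or $H_2 \neq \epsilon$.

(E7) $X(H_1) \simeq X(H_2) \wedge \mathcal{K} \leadsto \bigvee_{f \in \mathcal{F}} (X \doteq f \wedge f(H_1)\vartheta \doteq f(H_2)\vartheta \wedge \mathcal{K}\vartheta)$,

where $\vartheta = \{X \mapsto f\}$, and $H_1 \neq H_2$.

We denote the set of rules (E1)-(E7) by Elim. Note that the assumption of finiteness
of $\mathcal{F}$ guarantees that the disjunction in (E7) is finite.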


Membership Rules. 

The membership rules apply to disjuncts of constraints in DNF, to preserve the DNF
structure. They provide the membership check, if the hedge $H$ in the membership atom
$H \;\mathsf{in}\; R$ is ground. Nonground hedges require more special treatment, as one can see.

To solve membership constraints for hedges of the form $(t, H)$ with $t$ a term, we rely
on the possibility to compute the linear form of a regular expression, that is, to express it
as a finite sum of concatenations of regular hedge expressions that identify all plausible
membership constraints for $t$ and $H$. Formally, the linear form of a regular expression $R$,
denoted $lf(R)$, is a finite set of pairs $(f(R_1), R_2)$, which is defined recursively as follows:

$$\begin{array}{lll}
lf(\mathsf{eps}) & = & \emptyset. \\
lf(f(R)) & = & \{(f(R), \mathsf{eps})\}. \\
lf(R_1 + R_2) & = & lf(R_1) \cup lf(R_2). \\
lf(R_1 \cdot R_2) & = & lf(R_1) \odot R_2, \text{ if } \epsilon \notin [\![R_1]\!]. \\
lf(R_1 \cdot R_2) & = & lf(R_1) \odot R_2 \cup lf(R_2), \text{ if } \epsilon \in [\![R_1]\!]. \\
lf(R^*) & = & lf(R) \odot R^*.
\end{array}$$

These equations involve an extension of concatenation $\odot$ that acts on a linear form
and a regular expression and returns a linear form. It is defined as $l \odot \mathsf{eps} = l$, and
$l \odot R = \{(f(R_1), R_2 \cdot R) \mid (f(R_1), R_2) \in l,\ R_2 \neq \mathsf{eps}\} \cup \{(f(R_1), R) \mid (f(R_1), \mathsf{eps}) \in l\}$,
if $R \neq \mathsf{eps}$.

The linear form $lf(R)$ of a regular expression $R$ has the property (Antimirov 1996):³

$$[\![R]\!] \setminus \{\epsilon\} = \bigcup_{(f(R_1), R_2) \in lf(R)} [\![f(R_1) \cdot R_2]\!], \qquad \text{(LF)}$$

which justifies its use in the rule M2 below.
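The recursive definition of the linear form translates almost literally into code. The Python sketch below uses an encoding of our own choosing: regular hedge expressions are nested tuples tagged `"eps"`, `"sym"`, `"alt"`, `"cat"`, and `"star"`, and the helper names `nullable`, `odot`, and `lf` are hypothetical; `nullable(r)` decides whether $\epsilon \in [\![R]\!]$.

```python
EPS = ("eps",)


def nullable(r):
    """Decide whether the empty hedge belongs to the language of r."""
    tag = r[0]
    if tag in ("eps", "star"):
        return True
    if tag == "sym":                  # f(R) denotes terms, never the empty hedge
        return False
    if tag == "alt":
        return nullable(r[1]) or nullable(r[2])
    return nullable(r[1]) and nullable(r[2])      # "cat"


def odot(l, r):
    """Concatenate a regular expression r to the tail of every pair in l."""
    if r == EPS:
        return set(l)
    return {(head, r if tail == EPS else ("cat", tail, r)) for head, tail in l}


def lf(r):
    """Linear form of r: a set of pairs (f(R1), R2)."""
    tag = r[0]
    if tag == "eps":
        return set()
    if tag == "sym":
        return {(r, EPS)}
    if tag == "alt":
        return lf(r[1]) | lf(r[2])
    if tag == "cat":
        out = odot(lf(r[1]), r[2])
        return out | lf(r[2]) if nullable(r[1]) else out
    return odot(lf(r[1]), r)          # "star"


# f(a*) . b*, the expression used in Example 1
a, b = ("sym", "a", EPS), ("sym", "b", EPS)
fa_star = ("sym", "f", ("star", a))
example = ("cat", fa_star, ("star", b))
```

For the expression of Example 1, `lf(example)` contains the single pair $(f(a^*), b^*)$: the first element of a non-empty hedge in the language must be in $f(a^*)$ and the rest in $b^*$.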

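³ In (Antimirov 1996), this property has been formulated for word regular expressions, but it
straightforwardly extends to the regular hedge expressions we use in this paper.

The first group of membership rules looks as follows:


(M1) $(\overline{x}_1, \ldots, \overline{x}_n) \;\mathsf{in}\; \mathsf{eps} \wedge \mathcal{K} \leadsto \bigwedge_{i=1}^{n} \overline{x}_i \doteq \epsilon \wedge \mathcal{K}\vartheta$,

where $\vartheta = \{\overline{x}_1 \mapsto \epsilon, \ldots, \overline{x}_n \mapsto \epsilon\}$, $n > 0$.

(M2) $(t, H) \;\mathsf{in}\; R \wedge \mathcal{K} \leadsto \bigvee_{(f(R_1), R_2) \in lf(R)} (t \;\mathsf{in}\; f(R_1) \wedge H \;\mathsf{in}\; R_2 \wedge \mathcal{K})$,

where $H \neq \epsilon$ and $R \neq \mathsf{eps}$.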

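(M3) $(\overline{x}, H) \;\mathsf{in}\; f(R) \wedge \mathcal{K} \leadsto (\overline{x} \;\mathsf{in}\; f(R) \wedge H \doteq \epsilon \wedge \mathcal{K}) \vee (\overline{x} \doteq \epsilon \wedge H \;\mathsf{in}\; f(R) \wedge \mathcal{K})$,

where $H \neq \epsilon$.

(M4) $t \;\mathsf{in}\; R^* \leadsto t \;\mathsf{in}\; R$.

(M5) $t \;\mathsf{in}\; R_1 \cdot R_2 \wedge \mathcal{K} \leadsto (t \;\mathsf{in}\; R_1 \wedge \epsilon \;\mathsf{in}\; R_2 \wedge \mathcal{K}) \vee (\epsilon \;\mathsf{in}\; R_1 \wedge t \;\mathsf{in}\; R_2 \wedge \mathcal{K})$.

(M6) $t \;\mathsf{in}\; R_1 + R_2 \wedge \mathcal{K} \leadsto (t \;\mathsf{in}\; R_1 \wedge \mathcal{K}) \vee (t \;\mathsf{in}\; R_2 \wedge \mathcal{K})$.

(M7) $(\overline{x}, H) \;\mathsf{in}\; R_1 + R_2 \wedge \mathcal{K} \leadsto ((\overline{x}, H) \;\mathsf{in}\; R_1 \wedge \mathcal{K}) \vee ((\overline{x}, H) \;\mathsf{in}\; R_2 \wedge \mathcal{K})$.

(M8) $v \;\mathsf{in}\; R_1 \wedge v \;\mathsf{in}\; R_2 \leadsto v \;\mathsf{in}\; R$,

where $v \in \mathcal{V}_T \cup \mathcal{V}_H$, $[\![R]\!] = [\![R_1]\!] \cap [\![R_2]\!]$, and neither
$v \;\mathsf{in}\; R_1$ nor $v \;\mathsf{in}\; R_2$ can be transformed by the other rules.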

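Next, we have rules which constrain singleton hedges to be in a term language. They
proceed by straightforward matching or decomposition of the structure. Note that in
(M12), we require the arguments of the unordered function symbol to be terms. (M9)
and (M10) do not distinguish whether $f$ is ordered or unordered:

(M9) $\overline{x} \;\mathsf{in}\; f(R) \wedge \mathcal{K} \leadsto \overline{x} \doteq x \wedge x \;\mathsf{in}\; f(R) \wedge \mathcal{K}\{\overline{x} \mapsto x\}$, where $x$ is fresh.

(M10) $X(H) \;\mathsf{in}\; f(R) \wedge \mathcal{K} \leadsto X \doteq f \wedge f(H)\{X \mapsto f\} \;\mathsf{in}\; f(R) \wedge \mathcal{K}\{X \mapsto f\}$.

(M11) $f_o(H) \;\mathsf{in}\; f_o(R) \leadsto H \;\mathsf{in}\; R$.

(M12) $f_u(T) \;\mathsf{in}\; f_u(R) \wedge \mathcal{K} \leadsto \bigvee_{T' \in perm(T)} (T' \;\mathsf{in}\; R \wedge \mathcal{K})$.

We denote the set of rules (M1)-(M12) by Memb.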

5.2 The Constraint Solving Algorithm 

In this section we present an algorithm that converts a constraint with respect to the 
rules specified in Section 5.1 into a partially solved one. First, we define the rewrite step 

step := first(Log, Fail, Del, Dec, Elim, Memb). 

When applied to a constraint, step transforms it by the first applicable rule of the 
solver, looking successively into the sets Log, Fail, Del, Dec, Elim, and Memb. If none of 
them apply, then the constraint is said to be in a normal form with respect to step. 

The constraint solving algorithm implements the strategy solve defined as a repeated
application of the rewrite step, aiming at the computation of a normal form with respect
to step. But it also makes sure that the constraint passed to step is in DNF:

solve := compose(dnf, NF(step)).

Hence, solve takes a quantifier-free constraint, transforms it into an equivalent
constraint in DNF (the strategy dnf in the definition stands for the algorithm that does it),
and then repeatedly applies step to the obtained constraint in DNF as long as possible.
It remains to show that this definition yields an algorithm, which amounts to proving
that the strategy NF(step) indeed produces a constraint to which none of the rules from
Log, Fail, Del, Dec, Elim, and Memb apply. The termination theorem states exactly this:

Theorem 2 (Termination of solve) 

solve terminates on any quantifier-free constraint. 
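The way solve is assembled from first, NF, and compose can be sketched with generic strategy combinators. The Python code below is an illustration under our own assumptions, not the solver itself: a rule is modelled as a function that returns a new constraint or `None` when it does not apply, constraints are toy conjunctions given as tuples of strings, and `drop_true` and `to_false` are hypothetical stand-ins for two logical rules.

```python
def first(*rule_sets):
    """Apply the first applicable rule, scanning the rule sets in the given order."""
    def step(c):
        for rules in rule_sets:
            for rule in rules:
                result = rule(c)
                if result is not None:
                    return result
        return None                      # no rule applies: c is in normal form
    return step


def nf(step):
    """Repeat step as long as it applies and return the resulting normal form."""
    def run(c):
        while True:
            nxt = step(c)
            if nxt is None:
                return c
            c = nxt
    return run


def compose(*strategies):
    """Apply the strategies one after another, left to right."""
    def run(c):
        for s in strategies:
            c = s(c)
        return c
    return run


def drop_true(c):
    """Toy version of: true AND N rewrites to N."""
    if "true" in c and len(c) > 1:
        i = c.index("true")
        return c[:i] + c[i + 1:]
    return None


def to_false(c):
    """Toy version of: false AND N rewrites to false."""
    return ("false",) if "false" in c and c != ("false",) else None


# dnf is replaced by the identity here, since the toy input is already a conjunction.
solve = compose(lambda c: c, nf(first([drop_true, to_false])))
```

Termination of `nf` is not guaranteed by the combinator itself; it depends on the rules, which is what Theorem 2 establishes for the actual solver.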

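With the next two statements we show that the solver reduces a constraint to an
equivalent constraint:

Lemma 1

If step($C$) = $\mathcal{D}$, then
$\mathfrak{I} \models \forall (C \leftrightarrow \overline{\exists}_{var(C)} \mathcal{D})$ for all
intended structures $\mathfrak{I}$.

Theorem 3

If solve($C$) = $\mathcal{D}$, then
$\mathfrak{I} \models \forall (C \leftrightarrow \overline{\exists}_{var(C)} \mathcal{D})$ for all
intended structures $\mathfrak{I}$, and $\mathcal{D}$ is either partially solved or the false
constraint.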


6 Operational Semantics of CLP(H) 

In this section we describe the operational semantics of CLP(H), following the approach
for the CLP schema given in (Jaffar et al. 1998). A state is a pair
$\langle G \parallel C \rangle$, where $G$ is a sequence of literals and
$C = \mathcal{K}_1 \vee \cdots \vee \mathcal{K}_n$, where the $\mathcal{K}$'s are conjunctions of true,
false, and primitive constraints. The definition of an atom $p(t_1, \ldots, t_m)$ in program $P$,
$defn_P(p(t_1, \ldots, t_m))$, is the set of rules in $P$ such that the head of each rule has a form
$p(r_1, \ldots, r_m)$. We assume that $defn_P$ each time returns fresh variants.

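A state $\langle L_1, \ldots, L_n \parallel C \rangle$ can be reduced with respect to $P$ as follows:
Select a literal $L_i$. Then:

• If $L_i$ is a primitive constraint and solve($C \wedge L_i$) $\neq$ false, then it is reduced to
$\langle L_1, \ldots, L_{i-1}, L_{i+1}, \ldots, L_n \parallel \text{solve}(C \wedge L_i) \rangle$.

• If $L_i$ is a primitive constraint and solve($C \wedge L_i$) = false, then it is reduced to
$\langle \Box \parallel \mathsf{false} \rangle$.

• If $L_i$ is an atom $p(t_1, \ldots, t_m)$, then it is reduced to

$\langle L_1, \ldots, L_{i-1}, t_1 \doteq r_1, \ldots, t_m \doteq r_m, B, L_{i+1}, \ldots, L_n \parallel C \rangle$

for some $(p(r_1, \ldots, r_m) \leftarrow B) \in defn_P(L_i)$.

• If $L_i$ is an atom and $defn_P(L_i) = \emptyset$, then it is reduced to
$\langle \Box \parallel \mathsf{false} \rangle$.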

A derivation from a state $S$ in a program Pr is a finite or infinite sequence of states
$S_0 \rightarrowtail S_1 \rightarrowtail \cdots \rightarrowtail S_n \rightarrowtail \cdots$ where $S_0$ is $S$ and there is a reduction from each $S_{i-1}$ to
$S_i$, using rules in Pr. A derivation from a goal $G$ in a program Pr is a derivation from
$\langle G \parallel true \rangle$. The length of a (finite) derivation of the form $S_0 \rightarrowtail S_1 \rightarrowtail \cdots \rightarrowtail S_n$ is $n$. A
derivation is finished if the last goal cannot be reduced, that is, if its last state is of the
form $\langle \Box \parallel C \rangle$ where $C$ is partially solved or false. If $C$ is false, the derivation is said to be
failed.
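
The four reduction cases can be sketched in Python as follows. This is only an illustration under stated assumptions: the tuple encoding of literals, the function name `reduce_state`, and the stand-in `toy_solve` are hypothetical and are not the solver of this paper. Renaming clauses to fresh variants is omitted, so clauses are assumed to be already renamed apart.

```python
FALSE = "false"  # marker for the false constraint (hypothetical encoding)

def toy_solve(constraint, lit):
    """Stand-in for solve(C /\\ L): NOT the solver of the paper.
    It only rejects equations between two distinct constants (strings)."""
    tag, s, t = lit
    if tag == "eq" and isinstance(s, str) and isinstance(t, str) and s != t:
        return FALSE
    return constraint + (lit,)

def reduce_state(goal, constraint, i, program, solve, clause_index=0):
    """One reduction step on <goal || constraint>, selecting goal[i].

    Literals are ("con", payload) for primitive constraints and
    ("atom", name, args) for atoms. `program` maps a predicate name to a
    list of clauses (head_args, body)."""
    lit = goal[i]
    before, after = goal[:i], goal[i + 1:]
    if lit[0] == "con":
        new_constraint = solve(constraint, lit[1])
        if new_constraint == FALSE:
            return [], FALSE                      # second case
        return before + after, new_constraint    # first case
    _, name, args = lit
    clauses = [c for c in program.get(name, []) if len(c[0]) == len(args)]
    if not clauses:
        return [], FALSE                          # fourth case: no definition
    head_args, body = clauses[clause_index]
    equations = [("con", ("eq", t, r)) for t, r in zip(args, head_args)]
    return before + equations + list(body) + after, constraint  # third case
```

The solver is passed in as a parameter so that the control flow of the operational semantics stays separate from constraint solving; nondeterministic clause choice is modelled only by the `clause_index` argument.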

Naturally, it is interesting to find syntactic restrictions for programs guaranteeing that 
non-failed finished derivations produce a solved constraint instead of a partially solved 
one. In the next two sections we consider such restrictions, leading to well-moded and 
KIF style CLP(H) programs that have the desired property. 


7 Well-Moded Programs 

The concept of well-modedness is due to (Dembinski and Maluszynski 1985). A mode
for an n-ary predicate symbol $p$ is a function $m_p : \{1, \ldots, n\} \to \{i, o\}$. If $m_p(i) = i$
(resp. $m_p(i) = o$) then the position $i$ is called an input (resp. output) position of $p$. The
predicates in and = have only output positions. For a literal $L = p(t_1, \ldots, t_n)$ (where $p$
can be also in or =), we denote by $invar(L)$ and $outvar(L)$ the sets of variables occurring
in terms in the input and output positions of $p$.

If a predicate is used with different modes $m_p^1, \ldots, m_p^k$ in the program, we may consider
each $p^{m_p^i}$ as a separate predicate. Therefore, we can assume without loss of generality
that every predicate has exactly one mode (cf., e.g., (Ganzinger and Waldmann 1992)).

An extended literal $E$ is either a literal, true, or false. We define $invar(true) := \emptyset$,
$outvar(true) := \emptyset$, $invar(false) := \emptyset$, and $outvar(false) := \emptyset$.

A sequence of extended literals $E_1, \ldots, E_n$ is well-moded if the following hold:

1. For all $1 \leq i \leq n$, $invar(E_i) \subseteq \bigcup_{j=1}^{i-1} outvar(E_j)$.

2. If for some $1 \leq i \leq n$, $E_i$ is $t_1 = t_2$, then $var(t_1) \subseteq \bigcup_{j=1}^{i-1} outvar(E_j)$ or
$var(t_2) \subseteq \bigcup_{j=1}^{i-1} outvar(E_j)$.

3. If for some $1 \leq i \leq n$, $E_i$ is a membership atom, then the inclusion
$var(E_i) \subseteq \bigcup_{j=1}^{i-1} outvar(E_j)$ holds.

A conjunction of extended literals $G$ is well-moded if there exists a well-moded sequence
of extended literals $E_1, \ldots, E_n$ such that $G = \bigwedge_{i=1}^{n} E_i$ modulo associativity and
commutativity of conjunction. A formula in DNF is well-moded if each of its disjuncts is. A
state $\langle L_1, \ldots, L_n \parallel \mathcal{K}_1 \vee \cdots \vee \mathcal{K}_m \rangle$ is well-moded, where the $\mathcal{K}$'s are conjunctions of true, false,
and primitive constraints, if the formula $(L_1 \wedge \cdots \wedge L_n \wedge \mathcal{K}_1) \vee \cdots \vee (L_1 \wedge \cdots \wedge L_n \wedge \mathcal{K}_m)$
is well-moded.

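A clause $A \leftarrow L_1, \ldots, L_n$ is well-moded if the following hold:

1. For all $1 \leq i \leq n$, $invar(L_i) \subseteq \bigcup_{j=1}^{i-1} outvar(L_j) \cup invar(A)$.

2. $outvar(A) \subseteq \bigcup_{j=1}^{n} outvar(L_j) \cup invar(A)$.

3. If for some $1 \leq i \leq n$, $L_i$ is $t_1 = t_2$, then $var(t_1) \subseteq \bigcup_{j=1}^{i-1} outvar(L_j) \cup invar(A)$ or
$var(t_2) \subseteq \bigcup_{j=1}^{i-1} outvar(L_j) \cup invar(A)$.

4. If for some $1 \leq i \leq n$, $L_i$ is a membership atom, then $outvar(L_i) \subseteq \bigcup_{j=1}^{i-1} outvar(L_j)
\cup invar(A)$.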
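
The conditions for sequences and clauses can be checked mechanically. The following Python sketch is one possible reading of them, not an implementation taken from the paper: literals are abstracted to the variable sets that matter for modes, and the helper names (`atom`, `eq`, `member`, `check_sequence`) are hypothetical. The conjunction check tries all permutations and is therefore exponential; it is meant only for small illustrations.

```python
from itertools import permutations

def atom(invar, outvar):
    # user-defined atom, abstracted to its input/output variable sets
    return {"kind": "atom", "invar": set(invar), "outvar": set(outvar)}

def eq(lhs_vars, rhs_vars):
    # equation t1 = t2, abstracted to var(t1) and var(t2)
    return {"kind": "eq", "lhs": set(lhs_vars), "rhs": set(rhs_vars)}

def member(hedge_vars):
    # membership atom H in R, abstracted to var(H)
    return {"kind": "in", "vars": set(hedge_vars)}

def check_sequence(seq, known=()):
    """Return the set of variables produced so far if seq is well-moded
    relative to the already known variables, otherwise None."""
    seen = set(known)
    for lit in seq:
        if lit["kind"] == "atom":
            if not lit["invar"] <= seen:
                return None
            seen |= lit["outvar"]
        elif lit["kind"] == "eq":
            # one side must be fully determined by earlier output variables
            if not (lit["lhs"] <= seen or lit["rhs"] <= seen):
                return None
            seen |= lit["lhs"] | lit["rhs"]  # '=' has only output positions
        else:
            if not lit["vars"] <= seen:
                return None
    return seen

def is_well_moded_sequence(seq):
    return check_sequence(seq) is not None

def is_well_moded_conjunction(literals):
    return any(is_well_moded_sequence(p) for p in permutations(literals))

def is_well_moded_clause(head, body):
    produced = check_sequence(body, known=head["invar"])
    return produced is not None and head["outvar"] <= produced
```

For instance, a hypothetical clause p(x, y) ← q(x, z), z = y with the first argument of p and q as input passes the check, while the same body with the equation placed first does not.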

A program is well-moded if all its clauses are well-moded.

Example 3

In Example 1, if in the user-defined binary predicates rewrite and rule the first argument 
is the input position and the second argument is the output position, then it is easy to 
see that the program is well-moded. In Example 2, for well-modedness we need to define 
both positions in the user-defined predicates to be the input ones. 

In the rest of this section we investigate the behavior of well-moded programs. Before 
going into the details, we briefly summarize two main results: 

• The solver can completely solve satisfiable well-moded constraints (instead of
partial solutions computed in the general case). See Theorem 4.

• Any finished derivation from a well-moded goal with respect to a well-moded
program either ends with a completely solved constraint, or fails. See Theorem 5.

To prove these statements, some technical lemmas are needed. 

Lemma 2 

Let $v = e$ be an equation, where $v$ is a variable and $e$ is the corresponding expression
such that $v$ does not occur in $e$. Let $\mathcal{K}_1$ and $\mathcal{K}_2$ be two arbitrary (possibly empty)
conjunctions of extended literals such that the conjunction $\mathcal{K}_1 \wedge \mathcal{K}_2 \wedge v = e$ is well-moded.
Let $\theta = \{v \mapsto e\}$ be a substitution. Then $\mathcal{K}_1 \wedge \mathcal{K}_2\theta \wedge v = e$ is also well-moded.

The next lemma states that reduction with respect to a well-moded program preserves 
well-modedness of states: 

Lemma 3 

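Let Pr be a well-moded CLP(H) program and $\langle G \parallel C \rangle$ be a well-moded state. If
$\langle G \parallel C \rangle \rightarrowtail \langle G' \parallel C' \rangle$ is a reduction using clauses in Pr, then $\langle G' \parallel C' \rangle$ is also a well-moded
state.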

Corollary 1 

If C is a well-moded constraint, then solve(C) is also well-moded. 

The following theorem shows that satisfiable well-moded constraints can be completely 
solved: 

Theorem 4 

Let $C$ be a well-moded constraint and solve($C$) = $C'$, where $C' \neq$ false. Then $C'$ is solved.

We illustrate how to solve a simple well-moded constraint: 

Example 4 

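Let $C = f(\overline{x}, a, \overline{y}) = f(a, b, a, c, c) \wedge f(z, a, \overline{x}) = f(\overline{y}, \overline{x}) \wedge \overline{y}$ in $c(eps)^*$. Then solve performs
the following derivation (some steps are contracted):

$C \rightsquigarrow (\overline{x} = \epsilon \wedge (a, \overline{y}) = (a, b, a, c, c) \wedge f(z, a, \overline{x}) = f(\overline{y}, \overline{x}) \wedge \overline{y}$ in $c(eps)^*)$

$\vee\ (\overline{x} = a \wedge (a, \overline{y}) = (b, a, c, c) \wedge f(z, a, \overline{x}) = f(\overline{y}, \overline{x}) \wedge \overline{y}$ in $c(eps)^*)$

$\vee\ (\overline{x} = (a, b) \wedge (a, \overline{y}) = (a, c, c) \wedge f(z, a, \overline{x}) = f(\overline{y}, \overline{x}) \wedge \overline{y}$ in $c(eps)^*)$

$\vee\ (\overline{x} = (a, b, a, c, c) \wedge (a, \overline{y}) = \epsilon \wedge f(z, a, \overline{x}) = f(\overline{y}, \overline{x}) \wedge \overline{y}$ in $c(eps)^*)$

The obtained constraint is solved.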


The next theorem is the main result for well-moded CLP(H) programs. It states that 
any finished derivation from a well-moded goal leads to a solved constraint or to a failure: 


Theorem 5 

Let $\langle G \parallel true \rangle \rightarrowtail \cdots \rightarrowtail \langle \Box \parallel C' \rangle$ be a finished derivation with respect to a well-moded
CLP(H) program, starting from a well-moded goal $G$. If $C' \neq$ false, then $C'$ is solved.


8 Programs in the KIF Form 

Knowledge Interchange Format, KIF for short (Genesereth and Fikes 1992), is a
computer-oriented language for the interchange of knowledge among disparate programs. It permits
variadic syntax and hedge variables, under the restriction that such variables are only the
last arguments of subterms they appear in. Such a fragment has some good computational
properties, e.g., unification is unitary (Kutsia 2003). The special form of programs and
constraints considered in this section originates from this restriction.

Terms and hedges in the KIF form or, for short, KIF terms and KIF hedges, are defined
by the following grammar:

$t_\kappa ::= x \mid f_o(H_\kappa) \mid f_u(t_{\kappa 1}, \ldots, t_{\kappa n})$   $(n \geq 0)$   KIF Term

$H_\kappa ::= t_{\kappa 1}, \ldots, t_{\kappa n} \mid t_{\kappa 1}, \ldots, t_{\kappa n}, \overline{x}$   $(n \geq 0)$   KIF Hedge

That means that a term is in the KIF form if hedge variables occur only below ordered
function symbols as the last arguments. For example, the terms $f_o(x, f_o(a, \overline{x}), f_u(x, b), \overline{x})$
and $f_o(a, x, b)$ are in the KIF form, while $f_o(\overline{x}, a, x)$ and $f_u(x, f_o(a, \overline{x}), f_u(x, b), \overline{x})$ are not.
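
The KIF restriction is easy to test syntactically. The sketch below is an illustration only: the tagged-tuple encoding (`"tv"` for term variables, `"hv"` for hedge variables, `"fo"` and `"fu"` for ordered and unordered function symbols) and the function names are assumptions made here, and function variables are not covered.

```python
def is_kif_term(t):
    """Check the KIF form of a term.

    Encoding (hypothetical): ("tv", name), ("hv", name),
    ("fo", name, args) and ("fu", name, args) with args a list."""
    tag = t[0]
    if tag == "tv":
        return True
    if tag == "hv":
        return False          # a hedge variable alone is not a KIF term
    args = t[2]
    if tag == "fo":
        return is_kif_hedge(args)   # a hedge variable may close the hedge
    if tag == "fu":
        return all(is_kif_term(a) for a in args)  # no hedge variables here
    raise ValueError(f"unknown tag: {tag}")

def is_kif_hedge(h):
    # an optional hedge variable in the last position, KIF terms before it
    if h and h[-1][0] == "hv":
        h = h[:-1]
    return all(is_kif_term(t) for t in h)

# Shorthands for building the examples from the text.
x, xbar = ("tv", "x"), ("hv", "x")
a, b = ("fo", "a", []), ("fo", "b", [])

kif_1 = ("fo", "f", [x, ("fo", "f", [a, xbar]), ("fu", "f", [x, b]), xbar])
kif_2 = ("fo", "f", [a, x, b])
not_kif_1 = ("fo", "f", [xbar, a, x])
not_kif_2 = ("fu", "f", [x, ("fo", "f", [a, xbar]), ("fu", "f", [x, b]), xbar])
```

The four sample terms mirror the two positive and two negative examples given above, so the checker can be compared directly with the prose definition.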

If the language does not contain unordered function symbols, then we permit hedge
variables under function variables, again in the last position, i.e., of the form $X(H_\kappa)$.

In this section we consider only KIF terms. Therefore, the subscript $\kappa$ will be omitted.

KIF equations and KIF atoms are constructed from KIF terms. In a KIF membership
atom $H$ in $R$, the hedge $H$ is a KIF hedge.

KIF formulas are constructed from KIF primitive constraints and KIF atoms. This
special form guarantees that the solver does not need to use all the rules. Simply
inspecting them, we can see that Del1, E3, E4, and M3 are not used. In Del3, it is guaranteed
that $H_2$ will be always empty, and in M1 the $n$ will be equal to 1.

Similarly to the well-moded restriction above, our interest in the KIF fragment is
justified by its two important properties that characterize the KIF constraint solving
and derivation of KIF goals:

• The solver can completely solve satisfiable KIF constraints (instead of partial
solutions computed in the general case). See Theorem 6.

• Any finished derivation from a KIF goal with respect to a KIF program either ends
with a completely solved constraint, or fails. See Theorem 7.

Their proofs are easier than the ones of the corresponding statements for well-moded 
programs. This is largely due to the following lemma: 

Lemma 4 

Any partially solved KIF constraint is solved. 

One can see that no solving rule inserts a term or a hedge variable after the last 
argument of subterms in constraints. That means, KIF constraints are again transformed 
into KIF constraints. Hence, the constraint computed by solve will be a KIF constraint. 
It leads us to the following result: 

Theorem 6 

Let $C$ be a KIF constraint and solve($C$) = $C'$, where $C' \neq$ false. Then $C'$ is solved.

We illustrate now how to solve a simple KIF constraint: 

Example 5 

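Let $C = f(x, \overline{x}) = f(g(\overline{y}), a, \overline{y}) \wedge \overline{x}$ in $a(eps)^* \wedge \overline{y}$ in $a(eps) \cdot a(b(eps)^*)^*$. Then solve
performs the following derivation:

$C \rightsquigarrow x = g(\overline{y}) \wedge \overline{x} = (a, \overline{y}) \wedge \overline{x}$ in $a(eps)^* \wedge \overline{y}$ in $a(eps) \cdot a(b(eps)^*)^*$

$\rightsquigarrow x = g(\overline{y}) \wedge \overline{x} = (a, \overline{y}) \wedge (a, \overline{y})$ in $a(eps)^* \wedge \overline{y}$ in $a(eps) \cdot a(b(eps)^*)^*$

$\rightsquigarrow x = g(\overline{y}) \wedge \overline{x} = (a, \overline{y}) \wedge \overline{y}$ in $a(eps)^* \wedge \overline{y}$ in $a(eps) \cdot a(b(eps)^*)^*$

$\rightsquigarrow x = g(\overline{y}) \wedge \overline{x} = (a, \overline{y}) \wedge \overline{y}$ in $a(eps) \cdot a(eps)^*$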

The obtained constraint is solved. 

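A state $\langle L_1, \ldots, L_n \parallel \mathcal{K}_1 \vee \cdots \vee \mathcal{K}_m \rangle$ is in the KIF form (KIF state), if the formula
$(L_1 \wedge \cdots \wedge L_n \wedge \mathcal{K}_1) \vee \cdots \vee (L_1 \wedge \cdots \wedge L_n \wedge \mathcal{K}_m)$ is a KIF formula.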

KIF clauses are constructed from KIF atoms and literals. KIF programs are sets of KIF 
clauses. It is not hard to check that each reduction step (with respect to a KIF program) 
in the operational semantics preserves KIF states: It follows from the definition of the 
operational semantics and the fact that solve computes KIF constraints. Therefore, we 
can establish the following theorem: 

Theorem 7

Let $\langle G \parallel true \rangle \rightarrowtail \cdots \rightarrowtail \langle \Box \parallel C' \rangle$ be a finished derivation with respect to a KIF program,
starting from a KIF goal $G$. If $C' \neq$ false, then $C'$ is solved.

Example 6

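The well-known technique of appending two difference lists can be used in CLP(H) for a
more general task: to combine the arguments of two arbitrary terms. The program remains
the same as in standard logic programming:

$append\_dl(x_1 - x_2,\ x_2 - x_3,\ x_1 - x_3)$,

where the hyphen is a function symbol and $x_1, x_2, x_3$ are term variables. The KIF goal

$append\_dl(f_1(a, b, \overline{x}) - f_2(\overline{x}),\ f_2(c, d, e, \overline{y}) - f_3(\overline{y}),\ x - f_3)$

can be used to append to the arguments of $f_1(a, b)$ the arguments of $f_2(c, d, e)$, obtaining
$f_1(a, b, c, d, e)$. Note that the terms may have different heads. The derivation proceeds as
follows:

$\langle append\_dl(f_1(a, b, \overline{x}) - f_2(\overline{x}),\ f_2(c, d, e, \overline{y}) - f_3(\overline{y}),\ x - f_3) \parallel true \rangle$

$\rightarrowtail \langle x_1 - x_2 = f_1(a, b, \overline{x}) - f_2(\overline{x}),\ x_2 - x_3 = f_2(c, d, e, \overline{y}) - f_3(\overline{y}),\ x_1 - x_3 = x - f_3 \parallel true \rangle$

$\rightarrowtail \langle x_2 - x_3 = f_2(c, d, e, \overline{y}) - f_3(\overline{y}),\ x_1 - x_3 = x - f_3 \parallel x_1 = f_1(a, b, \overline{x}) \wedge x_2 = f_2(\overline{x}) \rangle$

$\rightarrowtail \langle x_1 - x_3 = x - f_3 \parallel
x_1 = f_1(a, b, c, d, e, \overline{y}) \wedge x_2 = f_2(c, d, e, \overline{y}) \wedge x_3 = f_3(\overline{y}) \wedge \overline{x} = (c, d, e, \overline{y}) \rangle$

$\rightarrowtail \langle \Box \parallel
x_1 = f_1(a, b, c, d, e) \wedge x_2 = f_2(c, d, e) \wedge x_3 = f_3 \wedge \overline{x} = (c, d, e) \wedge \overline{y} = \epsilon \wedge
x = f_1(a, b, c, d, e) \rangle$.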

The constraint in the final state is solved. 


9 Conclusion 

Solving equational and membership constraints over hedges is not an easy task: The
problem is infinitary and any procedure that explicitly computes all solutions is
non-terminating. The solver that we presented in this paper is not complete, but it is
terminating. It solves constraints partially and tries to detect failure as early as it can.

Incorporating the solver into the CLP schema gives CLP(H): constraint logic programming
for hedges. We defined algebraic semantics for it and used it to characterize the
constraint solver: The output of the solver (which is either partially solved or false) is
equivalent to the input constraint in all intended structures.

The fact that the solver, in general, returns a partially solved result (when it does not 
fail), naturally raises the question: Are there some interesting fragments of constraints 
that the solver can completely solve? We give a positive answer to this question, defining 
well-moded and KIF constraints and showing their complete solvability. 

This immediately poses the next question: Can one characterize CLP(H) programs that
generate only well-moded or KIF constraints? We show that by extending the notions
of well-modedness and KIF form to programs, we get the desired fragments. Any finished
derivation of a goal for such fragments gives a definite answer: Either the goal fails, or a
solved constraint is returned.

The constraints we consider in this paper are positive, but at least the well-moded
programs can be easily enriched with negation. Well-modedness guarantees that the
eventual test for disequality or non-membership in constraints will be performed on
ground hedges, which can be effectively decided.

Theorem 1

If the constraint $C$ is solved, then $\mathfrak{I} \models \exists C$ holds for all intended structures $\mathfrak{I}$.


Proof 

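Since $C$ is solved, each disjunct $\mathcal{K}$ in it has a form $v_1 = e_1 \wedge \cdots \wedge v_n = e_n \wedge v'_1$ in
$R_1 \wedge \cdots \wedge v'_m$ in $R_m$, where $m, n \geq 0$, $v_i, v'_j \in \mathcal{V}$ and $e_i$ is an expression corresponding
to $v_i$. Moreover, $v_1, \ldots, v_n, v'_1, \ldots, v'_m$ are distinct and $[\![R_j]\!] \neq \emptyset$ for all $1 \leq j \leq m$. Note
that while the $v_i$'s do not occur anywhere else in $\mathcal{K}$, it still might be the case that some $v'_j$,
$1 \leq j \leq m$, occurs in some $e_k$, $1 \leq k \leq n$.

Let $e'_j$ be an element of $[\![R_j]\!]$ for all $1 \leq j \leq m$. Assume that for each $1 \leq i \leq n$, the
substitution $\sigma'_i$ is a grounding substitution for $e_i$ with the property that $v'_j\sigma'_i = e'_j$ for
all $1 \leq j \leq m$. Then $\sigma = \{v_1 \mapsto e_1\sigma'_1, \ldots, v_n \mapsto e_n\sigma'_n, v'_1 \mapsto e'_1, \ldots, v'_m \mapsto e'_m\}$ solves $\mathcal{K}$.
Therefore, $\mathfrak{I} \models \exists C$ holds. □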

Theorem 2 (Termination of solve) 

solve terminates on any quantifier-free constraint. 


Proof 

We need to show that NF(step) terminates for any quantifier-free constraint in DNF. We
define a complexity measure $cm(C)$ for such constraints, and show that $cm(C') < cm(C)$
holds whenever $C'$ = step($C$).

For a hedge $H$ (resp., for a regular expression $R$), we denote by $size(H)$ (resp., by
$size(R)$) its denotational length, e.g., $size(\epsilon) = 0$, $size(eps) = 1$, $size(f(f(a)), x) = 4$,
and $size(f(f(a \cdot b^*))) = 6$.
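
One reading of the denotational length that reproduces the four sample values is to count every symbol occurrence (function symbols, variables, eps, and the regular operators). The Python sketch below follows that reading; the nested-tuple encodings and the function names are assumptions made for illustration and are not part of the paper.

```python
def hedge_size(hedge):
    """Size of a hedge given as a list of terms; the empty hedge has size 0."""
    return sum(term_size(t) for t in hedge)

def term_size(term):
    """A term is a string (variable or constant) or a pair (head, args)."""
    if isinstance(term, str):
        return 1
    head, args = term
    return 1 + hedge_size(args)

def regex_size(r):
    """A regular expression is a string (eps or a symbol) or a tagged tuple:
    ("app", f, R), ("star", R), ("cat", R1, R2), ("alt", R1, R2)."""
    if isinstance(r, str):
        return 1
    op = r[0]
    if op == "app":
        return 1 + regex_size(r[2])
    if op == "star":
        return 1 + regex_size(r[1])
    if op in ("cat", "alt"):
        return 1 + regex_size(r[1]) + regex_size(r[2])
    raise ValueError(f"unknown operator: {op}")
```

Whether a bare symbol such as a inside a regular expression should count as one or as a(eps) is not settled by the examples; the sketch counts it as one because that matches the stated value 6.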

The complexity measure $cm(\mathcal{K})$ of a conjunction of primitive constraints $\mathcal{K}$ is the tuple
$\langle N_1, M_1, N_2, M_2, M_3 \rangle$ defined as follows ($\{|\ |\}$ stands for a multiset):

$N_1$ is the number of unsolved variables in $\mathcal{K}$.

$M_1 := \{|\, size(H) \mid H$ in $R \in \mathcal{K}, H \neq \epsilon \,|\}$.

$N_2$ is the number of primitive constraints in the form $T$ in $R$ in $\mathcal{K}$.

$M_2 := \{|\, size(R) \mid H$ in $R \in \mathcal{K} \,|\}$.

$M_3 := \{|\, size(t_1) + size(t_2) \mid t_1 = t_2 \in \mathcal{K} \,|\}$.

The complexity measure $cm(C)$ of a constraint $C = \mathcal{K}_1 \vee \cdots \vee \mathcal{K}_n$ is defined as
$\{|\, cm(\mathcal{K}_1), \ldots, cm(\mathcal{K}_n) \,|\}$.

Measures are compared by the multiset extension of the lexicographic ordering on
tuples. The components that are natural numbers ($N_1$ and $N_2$) are, of course, compared
by the standard ordering on naturals. The multiset components $M_1$, $M_2$, and $M_3$ are
compared by the multiset extension of the standard ordering on the naturals.
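
The comparison of measures can be sketched in Python as follows. The sketch assumes the usual Dershowitz-Manna formulation of the multiset extension, which the text does not spell out, and it represents the multiset components of a tuple as sorted tuples of naturals so that they are hashable; all names are hypothetical.

```python
from collections import Counter

def multiset_greater(m, n, greater):
    """Multiset extension of a strict order `greater`: m > n iff m != n and
    every element that n has in excess is dominated by some excess element of m."""
    m, n = Counter(m), Counter(n)
    only_m, only_n = m - n, n - m
    if not only_m and not only_n:
        return False  # equal multisets
    return all(any(greater(x, y) for x in only_m) for y in only_n)

def nat_greater(a, b):
    return a > b

def mset_nat_greater(a, b):
    return multiset_greater(a, b, nat_greater)

# Orders for the components <N1, M1, N2, M2, M3> of cm(K).
COMPONENT_ORDERS = (nat_greater, mset_nat_greater, nat_greater,
                    mset_nat_greater, mset_nat_greater)

def tuple_greater(s, t):
    """Lexicographic comparison of two measure tuples."""
    for x, y, greater in zip(s, t, COMPONENT_ORDERS):
        if greater(x, y):
            return True
        if x != y:
            return False
    return False

def constraint_measure_greater(cm_before, cm_after):
    """Compare cm(C) values, i.e. multisets of measure tuples."""
    return multiset_greater(cm_before, cm_after, tuple_greater)
```

As a usage example, a disjunct with measure (1, (3,), 0, (2,), (5,)) may be replaced by two disjuncts with measure (1, (1, 1), 0, (2,), (5,)) and the overall measure still decreases, which is the situation created by rules that split a disjunct.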

The strict part of the ordering on measures is obviously well-founded. The Log rules
strictly reduce it. For the other rules, the table below shows which rule reduces which
component of the measure. The symbols $>$ and $\geq$ indicate the strict and non-strict
decrease, respectively. It implies the termination of the algorithm solve.

Rule                                           $N_1$    $M_1$    $N_2$    $M_2$    $M_3$
(M1), (M10), (E1)-(E7)                         $>$
(F5), (F7), (M2), (M3), (M8), (M11), (M12)     $\geq$   $>$
(M9)                                           $\geq$   $\geq$   $>$
(F6), (M4)-(M7)                                $\geq$   $\geq$   $\geq$   $>$
(D1), (D2), (F1)-(F4), (Del1)-(Del3)           $\geq$   $\geq$   $\geq$   $\geq$   $>$

□

Lemma 1 

If step(C) = $\mathcal{D}$, then $\mathfrak{I} \models \forall (C \leftrightarrow \overline{\exists}_{var(C)} \mathcal{D})$ for all intended structures $\mathfrak{I}$.

Proof 

By case distinction on the inference rules of the solver, selected by the strategy first in 
the application of step. We illustrate here two cases, when the selected rules are (E3) and 
(M2). For the other rules the lemma can be shown similarly. 

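In (E3), $C$ has a disjunct $\mathcal{K} = (\overline{x}, H) = T \wedge \mathcal{K}'$ with $\overline{x} \notin var(T)$, and $\mathcal{D}$ is the result of
replacing $\mathcal{K}$ in $C$ with the disjunction $C' = \bigvee_{T = (T_1, T_2)} (\overline{x} = T_1 \wedge H\vartheta = T_2 \wedge \mathcal{K}'\vartheta)$ where
$\vartheta = \{\overline{x} \mapsto T_1\}$. Therefore, it is sufficient to show that $\mathfrak{I} \models \forall (\mathcal{K} \leftrightarrow \overline{\exists}_{var(\mathcal{K})} C')$. Since
$var(C') = var(\mathcal{K})$, this amounts to showing that for all ground substitutions $\sigma$ of $var(\mathcal{K})$
we have $\mathfrak{I} \models (\overline{x}\sigma, H\sigma) = T\sigma \wedge \mathcal{K}'\sigma$ iff $\mathfrak{I} \models (\bigvee_{T = (T_1, T_2)} (\overline{x} = T_1 \wedge H\vartheta = T_2 \wedge \mathcal{K}'\vartheta))\sigma$.

Assume $\mathfrak{I} \models (\overline{x}\sigma, H\sigma) = T\sigma \wedge \mathcal{K}'\sigma$. We can split $T\sigma$ into $T_1\sigma$ and $T_2\sigma$ such that $\overline{x}\sigma = T_1\sigma$
and $H\sigma = T_2\sigma$. Now, we show $v\vartheta\sigma = v\sigma$ for all $v \in var(\overline{x}, H, T)$. Indeed, if $v \neq \overline{x}$, the
equality trivially holds. If $v = \overline{x}$, we have $\overline{x}\vartheta\sigma = T_1\sigma = \overline{x}\sigma$. Hence,
$\mathfrak{I} \models (\bigvee_{T = (T_1, T_2)} (\overline{x} = T_1 \wedge H\vartheta = T_2 \wedge \mathcal{K}'\vartheta))\sigma$.

Assume $\mathfrak{I} \models (\bigvee_{T = (T_1, T_2)} (\overline{x} = T_1 \wedge H\vartheta = T_2 \wedge \mathcal{K}'\vartheta))\sigma$. Then there exists the split
$T = (T_1, T_2)$ such that $\mathfrak{I} \models (\overline{x}\sigma = T_1\sigma \wedge H\vartheta\sigma = T_2\sigma \wedge \mathcal{K}'\vartheta\sigma)$. Again, we can show
$v\vartheta\sigma = v\sigma$ for all $v \in var(\overline{x}, H, T)$. Hence, $\mathfrak{I} \models (\overline{x}\sigma, H\sigma) = T\sigma \wedge \mathcal{K}'\sigma$. It finishes the proof
for (E3).

Now, let the selected rule be (M2). In this case $C$ has a disjunct $\mathcal{K} = (t, H)$ in $R \wedge
\mathcal{K}'$ with $H \neq \epsilon$ and $R \neq eps$. Then $\mathcal{D}$ is the result of replacing $\mathcal{K}$ in $C$ with
$C' = \bigvee_{(f(R_1), R_2) \in lf(R)} (t$ in $f(R_1) \wedge H$ in $R_2 \wedge \mathcal{K}')$. Therefore, to show
$\mathfrak{I} \models \forall (C \leftrightarrow \overline{\exists}_{var(C)} \mathcal{D})$, it
is enough to show that $\mathfrak{I} \models \forall (\mathcal{K} \leftrightarrow \overline{\exists}_{var(\mathcal{K})} C')$. Since $var(C') = var(\mathcal{K})$, this amounts to
showing that for all ground substitutions $\sigma$ of $var(\mathcal{K})$ we have $\mathfrak{I} \models (t\sigma, H\sigma)$ in $R \wedge \mathcal{K}'\sigma$
iff $\mathfrak{I} \models (\bigvee_{(f(R_1), R_2) \in lf(R)} (t$ in $f(R_1) \wedge H$ in $R_2 \wedge \mathcal{K}'))\sigma$.

Assume $\mathfrak{I} \models (t\sigma, H\sigma)$ in $R \wedge \mathcal{K}'\sigma$. By the property (lf) above and by the definitions
of intended structure and entailment, we get that $\mathfrak{I} \models (t\sigma, H\sigma)$ in $R \wedge \mathcal{K}'\sigma$ implies
$\mathfrak{I} \models (t\sigma, H\sigma)$ in $lf(R) \wedge \mathcal{K}'\sigma$. Hence, we can conclude
$\mathfrak{I} \models \bigvee_{(f(R_1), R_2) \in lf(R)} (t\sigma$ in $f(R_1) \wedge H\sigma$ in $R_2 \wedge \mathcal{K}'\sigma)$.

Assume $\mathfrak{I} \models \bigvee_{(f(R_1), R_2) \in lf(R)} (t\sigma$ in $f(R_1) \wedge H\sigma$ in $R_2 \wedge \mathcal{K}'\sigma)$. Then we have
$\mathfrak{I} \models (t\sigma, H\sigma)$ in $lf(R) \wedge \mathcal{K}'\sigma$ which, by (lf), implies $\mathfrak{I} \models (t\sigma, H\sigma)$ in $R \wedge \mathcal{K}'\sigma$.

Theorem 3

If solve(C) = $\mathcal{D}$, then $\mathfrak{I} \models \forall (C \leftrightarrow \overline{\exists}_{var(C)} \mathcal{D})$ for all intended structures $\mathfrak{I}$, and $\mathcal{D}$ is either
partially solved or the false constraint.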

Proof 

We assume without loss of generality that $C$ is in DNF. $\mathfrak{I} \models \forall (C \leftrightarrow \overline{\exists}_{var(C)} \mathcal{D})$ follows
from Lemma 1 and the following property: If $\mathfrak{I} \models \forall (C_1 \leftrightarrow \overline{\exists}_{var(C_1)} C_2)$ and
$\mathfrak{I} \models \forall (C_2 \leftrightarrow \overline{\exists}_{var(C_2)} C_3)$, then $\mathfrak{I} \models \forall (C_1 \leftrightarrow \overline{\exists}_{var(C_1)} C_3)$. The property itself relies on the fact that
$\mathfrak{I} \models \forall (\overline{\exists}_{var(C_1)} \overline{\exists}_{var(C_2)} C_3 \leftrightarrow \overline{\exists}_{var(C_1)} C_3)$, which holds because all variables introduced by
the rules of the solver in $C_3$ are fresh not only for $C_2$, but also for $C_1$.

As for the partially solved constraint, by the definition of solve and Theorem 2, $\mathcal{D}$ is in
a normal form. Assume by contradiction that it is not partially solved. By inspection of
the solver rules, based on the definition of partially solved constraints, we can see that
there is a rule that applies to $\mathcal{D}$. But this contradicts the fact that $\mathcal{D}$ is in a normal form.
Hence, $\mathcal{D}$ is partially solved. □

Lemma 2 

Let $v = e$ be an equation, where $v$ is a variable and $e$ is the corresponding expression
such that $v$ does not occur in $e$. Let $\mathcal{K}_1$ and $\mathcal{K}_2$ be two arbitrary (possibly empty)
conjunctions of extended literals such that the conjunction $\mathcal{K}_1 \wedge \mathcal{K}_2 \wedge v = e$ is well-moded.
Let $\theta = \{v \mapsto e\}$ be a substitution. Then $\mathcal{K}_1 \wedge \mathcal{K}_2\theta \wedge v = e$ is also well-moded.

Proof 

The point in this lemma is that it does not matter how $\mathcal{K}_1$ and $\mathcal{K}_2$ are chosen. We
consider two cases. First, when $v = e$ is the leftmost literal containing $v$ in a well-moded
sequence corresponding to $\mathcal{K}_1 \wedge \mathcal{K}_2 \wedge v = e$ and, second, when this is not the case.

Case 1. Let $E_1, v = e, E_2$ be a well-moded sequence corresponding to $\mathcal{K}_1 \wedge \mathcal{K}_2 \wedge v = e$,
such that $E_1$ does not contain $v$. Note that there is no assumption (apart from what
guarantees well-modedness of $\mathcal{K}_1 \wedge \mathcal{K}_2 \wedge v = e$) on the appearance of literals in $E_1$ and
$E_2$: They may contain literals from $\mathcal{K}_1$ only, from $\mathcal{K}_2$ only, or from both $\mathcal{K}_1$ and $\mathcal{K}_2$.

Well-modedness of $E_1, v = e, E_2$ requires the variables of $e$ to appear in $E_1$. Consider
the sequence $E_1, v = e, E_2[\theta]$, where the notation $E[\theta]$ stands for such an instance of $E$ in
which $\theta$ affects only literals from $\mathcal{K}_2$. Then $E_1, v = e$ is well-moded and it can be safely
extended by $E_2[\theta]$ without violating well-modedness, because the variables in $v = e$ still
precede (in the well-moded sequence) the literals from $E_2[\theta]$, and the relative order of the
other variables (in the well-moded sequence) does not change. Hence, $E_1, v = e, E_2[\theta]$ is
a well-moded sequence that corresponds to $\mathcal{K}_1 \wedge \mathcal{K}_2\theta \wedge v = e$.

Case 2. Let $E_1, L, E_2, v = e, E_3$ be a well-moded sequence corresponding to $\mathcal{K}_1 \wedge \mathcal{K}_2 \wedge
v = e$, where $L$ is the leftmost literal that contains $v$ in an output position. Again, we
make no assumption on literal appearances in the subsequences of the sequence. Then
$E_1, L, v = e, E_2, E_3$ is also a well-moded sequence (corresponding to $\mathcal{K}_1 \wedge \mathcal{K}_2 \wedge v = e$),
because $v$ still appears in an output position in $L$ to the left of $v = e$, the variables in $e$ still
precede literals from $E_3$, and the relative order of the other variables does not change.
For literals in $E_2$ that contain variables from $e$ such a reordering does not matter.

Note that $v$ does not appear in $E_1$: If it were there in some literal in an output position,
then $L$ would not be the leftmost such literal. If it were there in some literal $L'$ in an
input position, then well-modedness of the sequence would require $v$ to appear in an
output position in another literal $L''$ that is even before $L'$, i.e., to the left of $L$, and it
would again contradict the assumption that $L$ is the leftmost literal containing $v$ in an
output position.

Let $E_1, L[\theta], v = e, E_2[\theta], E_3[\theta]$ be a sequence of all literals taken from $\mathcal{K}_1 \wedge \mathcal{K}_2\theta \wedge v = e$.
We distinguish two cases, depending on whether $\theta$ affects $L$ or not.

$\theta$ affects $L$. Then it replaces $v$ in $L$ with $e$, i.e., $L[\theta] = L\theta$. Then the variables of $e$ appear
in output positions in $L\theta$ and, hence, placing $v = e$ after $L\theta$ in the sequence would not
destroy well-modedness. As for $L\theta$ itself, we have two alternatives:

1. $L\theta$ is an equation, say $s = t\theta$, obtained from $L = (s = t)$ by replacing occurrences
of $v$ in $t$ by $e$. In this case, by well-modedness of $E_1, L, v = e, E_2, E_3$, variables of
$s$ appear in $E_1$ and $s$ does not contain $v$. Then the same property is maintained
in $E_1, L\theta, v = e, E_2[\theta], E_3[\theta]$, since $s$ remains in $L\theta$ and $E_1$ does not change.

2. $L\theta$ is an atom. Then replacing $v$ by $e$ in an output position of $L$, which gives $L\theta$,
does not affect well-modedness.

Hence, we got that $E_1, L\theta, v = e$ is well-moded. Now we can safely extend this sequence
with $E_2[\theta], E_3[\theta]$, because variables in new occurrences of $e$ in $E_2[\theta], E_3[\theta]$ are preceded
by $v = e$, and the relative order of the other variables does not change. Hence, the
sequence $E_1, L\theta, v = e, E_2[\theta], E_3[\theta]$ is well-moded.

$\theta$ does not affect $L$. Then $L[\theta] = L$, the sequence $E_1, L, v = e$ is well-moded and it can
be safely extended with $E_2[\theta], E_3[\theta]$, obtaining the well-moded sequence $E_1, L, v = e,
E_2[\theta], E_3[\theta]$.

Hence, we showed also in Case 2 that there exists a well-moded sequence of literals,
namely, $E_1, L[\theta], v = e, E_2[\theta], E_3[\theta]$, that corresponds to $\mathcal{K}_1 \wedge \mathcal{K}_2\theta \wedge v = e$. Hence,
$\mathcal{K}_1 \wedge \mathcal{K}_2\theta \wedge v = e$ is well-moded. □

Lemma 3 

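Let Pr be a well-moded CLP(H) program and $\langle G \parallel C \rangle$ be a well-moded state. If
$\langle G \parallel C \rangle \rightarrowtail \langle G' \parallel C' \rangle$ is a reduction using clauses in Pr, then $\langle G' \parallel C' \rangle$ is also a well-moded
state.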

Proof 

Let $G = L_1, \ldots, L_i, \ldots, L_n$, $C = \mathcal{K}_1 \vee \cdots \vee \mathcal{K}_m$, and $\langle G \parallel C \rangle$ be a well-moded state. We will
use the notation $G$ for the conjunction of all literals in $G$, i.e., $G = L_1 \wedge \cdots \wedge L_i \wedge \cdots \wedge L_n$.
Assume that $L_i$ is the selected literal in the reduction that gives $\langle G' \parallel C' \rangle$ from $\langle G \parallel C \rangle$. We
consider four possible cases, according to the definition of operational semantics:

Case 1. Let $L_i$ be a primitive constraint and $C' \neq$ false. Let $\mathcal{D}$ denote the DNF of
$C \wedge L_i$.

In order to prove that $\langle G' \parallel C' \rangle$ is well-moded, by the definition of solve, it is sufficient
to prove that $\langle G' \parallel step(\mathcal{D}) \rangle$ is well-moded. Since, obviously, $\langle G' \parallel \mathcal{D} \rangle$ is a well-moded
state, we have to show that state well-modedness is preserved by each rule of the solver.

Since $C' \neq$ false, the step is not performed by any of the failure rules of the solver. For
the rules M1-M8, M11-M12, D1, and D2, it is easy to verify that $\langle G' \parallel step(\mathcal{D}) \rangle$ is
well-moded. Therefore, we consider the other rules in more detail. We denote the disjunct
of $\mathcal{D}$ on which the rule is applied by $\mathcal{K}_\mathcal{D}$. The cases below are distinguished by the rules:

Del. Here the same variable is removed from both sides of the selected equation. Assume
$E_1, s = t, E_2$ is a well-moded sequence corresponding to $G' \wedge \mathcal{K}_\mathcal{D}$, and $s = t$ is the selected
equation affected by one of the deletion rules. Well-modedness of $E_1, s = t, E_2$ requires
that the variable deleted at this step from $s = t$ should occur in an output position
in some other literal in $E_1$. Let $s' = t'$ be the equation obtained by the deletion step
from $s = t$. Then $E_1, s' = t', E_2$ is again well-moded, which implies that $G' \wedge step(\mathcal{K}_\mathcal{D})$ is
well-moded and, therefore, that $\langle G' \parallel step(\mathcal{D}) \rangle$ is well-moded.

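M9. Let $G' \wedge \mathcal{K}_\mathcal{D}$ be represented as $G' \wedge \overline{x}$ in $f(R) \wedge \mathcal{K}'$, where $\overline{x}$ in $f(R)$ is the membership
atom affected by the rule. Note that then $G' \wedge \overline{x} = x \wedge x$ in $f(R) \wedge \mathcal{K}'$ is also
well-moded. Applying Lemma 2, we get that $G' \wedge \overline{x} = x \wedge x$ in $f(R) \wedge \mathcal{K}'\theta$ is well-moded,
where $\theta = \{\overline{x} \mapsto x\}$. Then we get well-modedness of $G' \wedge step(\mathcal{K}_\mathcal{D})$, which implies
well-modedness of $\langle G' \parallel step(\mathcal{D}) \rangle$.

M10. Let $G' \wedge \mathcal{K}_\mathcal{D}$ be represented as $G' \wedge X(H)$ in $f(R) \wedge \mathcal{K}'$, where $X(H)$ in $f(R)$ is the
membership atom affected by the rule. Note that then $G' \wedge X(H)$ in $f(R) \wedge X = f \wedge \mathcal{K}'$
is also well-moded. Applying Lemma 2, we get that $G' \wedge X(H)\theta$ in $f(R) \wedge X = f \wedge \mathcal{K}'\theta$
is well-moded, where $\theta = \{X \mapsto f\}$. But it means that $G' \wedge step(\mathcal{K}_\mathcal{D})$ is well-moded,
which implies that $\langle G' \parallel step(\mathcal{D}) \rangle$ is well-moded.

E1, E2. For these rules, well-modedness of $G' \wedge step(\mathcal{K}_\mathcal{D})$ is a direct consequence of
Lemma 2.

E3. Let $G' \wedge \mathcal{K}_\mathcal{D}$ be represented as $G' \wedge (\overline{x}, H_1) = H_2 \wedge \mathcal{K}'$, where $(\overline{x}, H_1) = H_2$ is the
equation affected by the rule and $\overline{x} \notin var(H_2)$. Then $G' \wedge \overline{x} = H' \wedge H_1 = H'' \wedge \mathcal{K}'$ is
also well-moded for some $H'$ and $H''$ with $(H', H'') = H_2$. Applying Lemma 2, we get
that $G' \wedge \overline{x} = H' \wedge H_1\theta = H'' \wedge \mathcal{K}'\theta$ is well-moded, where $\theta = \{\overline{x} \mapsto H'\}$. Since $H'$
and $H''$ were arbitrary, it implies that $G' \wedge step(\mathcal{K}_\mathcal{D})$ and, therefore, $\langle G' \parallel step(\mathcal{D}) \rangle$ is
well-moded.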

E4. Similar to the case of the rule E3. 

Case 2. Let $L_i$ be a primitive constraint and $C'$ = false, where $C'$ = solve($C \wedge L_i$). Then
by the operational semantics we have $G' = \Box$ and the lemma trivially holds, since the
state $\langle \Box \parallel false \rangle$ is well-moded.

Case 3. Let $L_i$ be an atom $p(t_1, \ldots, t_k, \ldots, t_l)$. Assume that Pr contains a clause of the
form $p(r_1, \ldots, r_k, \ldots, r_l) \leftarrow B$, where $B$ denotes the body of the clause. Assume also that
for the predicate $p$, the set $\{1, \ldots, k\}$ is the set of the input positions and $\{k + 1, \ldots, l\}$
is the set of the output ones. Then we have

$G = L_1, \ldots, L_{i-1}, p(t_1, \ldots, t_l), L_{i+1}, \ldots, L_n$,

$G' = L_1, \ldots, L_{i-1}, t_1 = r_1, \ldots, t_k = r_k, \ldots, t_l = r_l, B, L_{i+1}, \ldots, L_n$,

$C' = C = \mathcal{K}_1 \vee \cdots \vee \mathcal{K}_m$.

From well-modedness of the state $\langle G \parallel C \rangle$ we know that for all $1 \leq j \leq m$, the literals
from $L_1, \ldots, L_{i-1}, L_{i+1}, \ldots, L_n$ and $\mathcal{K}_j$ can be reordered in two sequences of literals $E_j^1$
and $E_j^2$ in such a way that the sequence $E_j^1, p(t_1, \ldots, t_k, \ldots, t_l), E_j^2$ is well-moded. Then we
have $var(t_1, \ldots, t_k) \subseteq outvar(E_j^1)$. Therefore, we obtain that the sequence

$E_j^1, t_1 = r_1, \ldots, t_k = r_k$   (A1)

is well-moded for all $1 \leq j \leq m$.

From well-modedness of $p(r_1, \ldots, r_k, \ldots, r_l) \leftarrow B$ we know that $var(r_{k+1}, \ldots, r_l) \subseteq
outvar(B) \cup var(r_1, \ldots, r_k)$. By item 1 of the definition of program well-modedness, the
literals of $B$ can be put into a well-moded sequence, written, say, as $B_1, \ldots, B_q$, such
that for each $1 \leq u \leq q$ and $v \in invar(B_u)$ we have $v \in outvar(B_{u'})$ for some $u' < u$, or
$v \in var(r_1, \ldots, r_k)$. From this we can say that the sequence

$t_1 = r_1, \ldots, t_k = r_k, B_1, \ldots, B_q, t_{k+1} = r_{k+1}, \ldots, t_l = r_l$   (A2)

is well-moded.

From (A1) and (A2), by the definition of well-modedness, we can conclude that

$E_j^1, t_1 = r_1, \ldots, t_k = r_k, B_1, \ldots, B_q, t_{k+1} = r_{k+1}, \ldots, t_l = r_l, E_j^2$   (A3)

is well-moded for all $1 \leq j \leq m$. By construction, the literals in (A3) are exactly those
from $G' \wedge \mathcal{K}_j$ for $1 \leq j \leq m$. It means that $\langle G' \parallel \mathcal{K}_j \rangle$ is well-moded for all $1 \leq j \leq m$,
which implies that $\langle G' \parallel C' \rangle$ is well-moded.

Case 4. If $defn_{Pr}(L_i) = \emptyset$, then $G' = \Box$, $C'$ = false, and the lemma trivially holds. □

Corollary 1

If C is a well-moded constraint, then solve(C) is also well-moded. 

Proof 

By the definition of well-modedness, since $C$ is well-moded, the state $\langle a = a \parallel C \rangle$ is also
well-moded, where $a$ is an arbitrary function symbol. By the operational semantics, we
have the reduction $\langle a = a \parallel C \rangle \rightarrowtail \langle \Box \parallel solve(a = a \wedge C) \rangle$. By Lemma 3, we get that
$\langle \Box \parallel solve(a = a \wedge C) \rangle$ is also well-moded and, hence, solve($a = a \wedge C$) is well-moded.
By the definition of solve and the rules of the solver, it is straightforward to see that
solve($a = a \wedge C$) = solve($C$). Hence, solve($C$) is well-moded. □

Theorem 4 

Let $C$ be a well-moded constraint and solve($C$) = $C'$, where $C' \neq$ false. Then $C'$ is solved.

Proof

By Corollary 1, the constraint $C'$ is well-moded. If $C'$ is true then it is already solved.
Consider the case when $C'$ is not false. Let $C' = \mathcal{K}_1 \vee \cdots \vee \mathcal{K}_m$. Since $C' \neq$ false, by
Theorem 3 $C'$ is partially solved. It means that each $\mathcal{K}_j$, $1 \leq j \leq m$, is partially solved and
well-moded. By definition, $\mathcal{K}_j$ is well-moded if there exists a permutation of its literals
$c_1, \ldots, c_i, \ldots, c_n$ which satisfies the well-modedness property. Assume $c_1, \ldots, c_{i-1}$ are
solved. By this assumption and the definition of well-modedness, each of $c_1, \ldots, c_{i-1}$ is
an equation whose one side is a variable that occurs neither in its other side nor in any
other primitive constraint. Then well-modedness of $\mathcal{K}_j$ guarantees that the other sides
of these equations are ground terms. Assume by contradiction that $c_i$ is partially solved,
but not solved. If $c_i$ is a membership constraint, well-modedness of $\mathcal{K}_j$ implies that $c_i$
does not contain variables and, therefore, cannot be partially solved. Now let $c_i$ be an
equation. Since all variables in $c_1, \ldots, c_{i-1}$ are solved, they cannot appear in $c_i$. From
this fact and well-modedness of $\mathcal{K}_j$, $c_i$ should have at least one ground side. But then it
cannot be partially solved. The obtained contradiction shows that $C'$ is solved. □

Theorem 5

Let $\langle G \parallel true \rangle \rightarrowtail \cdots \rightarrowtail \langle \Box \parallel C' \rangle$ be a finished derivation with respect to a well-moded
CLP(H) program, starting from a well-moded goal $G$. If $C' \neq$ false, then $C'$ is solved.

Proof 

We prove a slightly more general statement: Let (G || true) ^ > (G' || C') be a 

derivation with respect to a well-moded program, starting from a well-moded goal G and 
ending with G' that is either □ or consists only of atomic formulas without arguments 
(propositional constants). If C false, then C is solved. 

To prove this statement, we use induction on the length n of the derivation. When 
n = 0, then C = true and it is solved. Assume the statement holds when the derivation 
length is n, and prove it for the derivation with the length n -|- 1. Let such a derivation 
be (G II true) ^ ^ (G„ || C„) ^ (G„+i || C„+i). Assume that G„+i that is either □ 

or consists only of propositional constants. According to the operational semantics, there 
are two possibilities how the last step is made: 

1. $G_n$ has a form (modulo permutation) $L, p_1, \ldots, p_m$, $m \geq 0$, where $L$ is a primitive
constraint, the $p$'s are propositional constants, $G_{n+1} = p_1, \ldots, p_m$, and $C_{n+1} = \mathit{solve}(C_n \wedge L)$.

2. $G_n$ has a form (modulo permutation) $q, p_1, \ldots, p_m$, $m \geq 0$, where $q$ and the $p$'s are
propositional constants, the program contains a clause $q \leftarrow q_1, \ldots, q_k$, $k \geq 0$, where all $q_i$,
$1 \leq i \leq k$, are propositional constants, $G_{n+1} = q_1, \ldots, q_k, p_1, \ldots, p_m$, and $C_{n+1} = C_n$.

In the first case, by the $n$-fold application of Lemma 3 we get that $\langle G_n \parallel C_n\rangle$ is
well-moded. Since the $p$'s have no influence on well-modedness (they are just propositional
constants), $C_n \wedge L$ is well-moded and hence it is solvable. By Theorem 4 we get that if
$C_{n+1} = \mathit{solve}(C_n \wedge L) \neq \mathit{false}$, then $C_{n+1}$ is solved.

In the second case, since $G_n$ consists of propositional constants only, by the induction
hypothesis we have that if $C_n$ is not $\mathit{false}$, then it is solved. But $C_n = C_{n+1}$, which finishes
the proof. □

Lemma 4

Any partially solved KIF constraint is solved. 

Proof 

Let $\mathcal{K}$ be a partially solved conjunction of primitive constraints. Then, by the definition,
each primitive constraint $c$ from $\mathcal{K}$ should be either solved in $\mathcal{K}$, or should have one of
the following forms:

• Membership atom:

  - $f_u(H_1, x, H_2)$ in $f_u(R)$.

  - $(x, H)$ in $R$ where $H \neq \epsilon$ and $R$ has the form $R_1 \cdot R_2$ or $R^*$.

• Equation:

  - $(x, H_1) = (y, H_2)$ where $x \neq y$, $H_1 \neq \epsilon$ and $H_2 \neq \epsilon$.

  - $(x, H_1) = (T, y, H_2)$, where $x \notin \mathit{var}(T)$, $H_1 \neq \epsilon$, and $T \neq \epsilon$. The variables $x$ and $y$
    are not necessarily distinct.

  - $f_u(H_1, x, H_2) = f_u(H_3, y, H_4)$ where $(H_1, x, H_2)$ and $(H_3, y, H_4)$ are disjoint.
However, $c$ is also a KIF constraint. By the definition of KIF form, none of the
above-mentioned forms for membership atoms and equations are permitted. Hence, $c$ is solved
in $\mathcal{K}$ and, therefore, $\mathcal{K}$ is solved. This implies the lemma. □

Theorem 6 

Let $C$ be a KIF constraint and $\mathit{solve}(C) = C'$, where $C' \neq \mathit{false}$. Then $C'$ is solved.

Proof 

By Theorem 3, $C'$ should be in a partially solved form. It is also in the KIF form, as we
noted above. Then, by Lemma 4, $C'$ is solved. □
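
The argument can be laid out as a chain of implications. The following is only a schematic restatement, assuming (as the proof does) that the solver preserves the KIF form; the predicate names $\mathrm{KIF}$, $\mathrm{psolved}$, and $\mathrm{solved}$ are shorthand introduced here for illustration and are not notation from the paper.

```latex
\begin{align*}
\mathit{solve}(C) = C',\ C' \neq \mathit{false}
  &\;\Longrightarrow\; \mathrm{psolved}(C') && \text{(Theorem 3)} \\
\mathrm{KIF}(C)
  &\;\Longrightarrow\; \mathrm{KIF}(C') && \text{(the solver preserves the KIF form)} \\
\mathrm{psolved}(C') \wedge \mathrm{KIF}(C')
  &\;\Longrightarrow\; \mathrm{solved}(C') && \text{(Lemma 4)}
\end{align*}
```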

Theorem 7

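Let $\langle G \parallel \mathit{true}\rangle \rightarrowtail \cdots \rightarrowtail \langle \Box \parallel C'\rangle$ be a finished derivation with respect to a KIF program,
starting from a KIF goal $G$. If $C' \neq \mathit{false}$, then $C'$ is solved.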

Proof 

Since the reduction preserves KIF states, $C'$ is in the KIF form. Since the derivation is
finished and $C' \neq \mathit{false}$, by the definition of finished derivation, $C'$ is partially solved. By
Lemma 4, we conclude that $C'$ is solved. □



